Preface to First Edition



The purpose of this book is to exposit, as simply as possible, some 
recent results obtained by a number of researchers in the application of 
optimal control theory to management science. We believe that these re- 
sults are very important and deserve to be widely known by management 
scientists, mathematicians, engineers, economists, and others. Because 
the mathematical background required to use this book is two or three 
semesters of calculus plus some differential equations and linear algebra, 
the book can easily be used to teach a course in the junior or senior 
undergraduate years or in the early years of graduate work. For this 
purpose, we have included numerous worked-out examples in the text, 
as well as a fairly large number of exercises at the end of each chapter. 
Answers to selected exercises are included in the back of the book. A 
solutions manual containing completely worked-out solutions to all of 
the 205 exercises is also available to instructors. 

The emphasis of the book is not on mathematical rigor, but on modeling
realistic situations faced in business and management. For that
reason, we have given in Chapters 2 and 7 only heuristic proofs of the 
continuous and discrete maximum principles, respectively. In Chapter 3 
we have summarized, as succinctly as we can, the most important model 
types and terminal conditions that have been used to model management 
problems. We found it convenient to put a summary of almost all the 
important management science models on two pages: see Tables 3.1 and 
3.3. 

One of the fascinating features of optimal control theory is the ex- 
traordinarily wide range of its possible applications. We have tried to 
cover a wide variety of applications as follows: Chapter 4 covers finance; 
Chapter 5 considers production and inventory; Chapter 6 covers mar- 
keting; Chapter 8 treats machine maintenance and replacement; Chap- 
ter 9 deals with problems of optimal consumption of natural resources 
(renewable or exhaustible); and Chapter 10 discusses several economic 
applications. 

In Chapter 11 we treat some computational algorithms for solving 
optimal control problems. This is a very large and important area that 
needs more development. 







Chapter 12 treats several more advanced topics of optimal con- 
trol: differential games, distributed parameter systems, optimal filtering, 
stochastic optimal control, and impulsive control. We believe that some
of these models are capable of wider applications and further theoretical 
development. 

Finally, four appendixes cover either elementary material, such as 
differential equations, or advanced material, whose inclusion in the main 
text would spoil its continuity. Also at the end of the book is a bibliogra- 
phy of works actually cited in the text. While it is extensive, it is by no 
means an exhaustive bibliography of management science applications 
of optimal control theory. Several surveys of such applications, which 
contain many other important references, are cited. 

We have benefited greatly during the writing of this book by hav- 
ing discussions with and obtaining suggestions from various colleagues 
and students. Our special thanks go to Gustav Feichtinger for his care- 
ful reading and suggestions for improvement of the entire book. Carl 
Norström contributed two examples to Chapters 4 and 5 and made many
suggestions for improvement. Jim Bookbinder used the manuscript for 
a course at the University of Toronto, and Tom Morton suggested some 
improvements for Chapter 5. The book has also benefited greatly from 
various coauthors with whom we have done research over the years. Both 
of us have also received numerous suggestions for improvements from the
students in our applied control theory courses taught during the past sev- 
eral years. We would like to express our gratitude to all these people for 
their help. 

The book has gone through several drafts, and we are greatly in- 
debted to Eleanor Balocik and Rosilita Jones for their patience and 
careful typing. 

Although the applications of optimal control theory to management 
science are recent and many fascinating applications have already been 
made, we believe that much remains to be done. We hope that this book 
will contribute to the popularity of the area and will enhance future 
developments. 



Toronto, August 1981 
Pittsburgh, August 1981 



Suresh P. Sethi 
Gerald L. Thompson 




Preface to Second Edition 



The first edition of this book, which provided an introduction to op- 
timal control theory and its applications to management science to many 
students in management, industrial engineering, operations research and 
economics, went out of print a number of years ago. Over the years we 
have received feedback concerning its contents from a number of instruc- 
tors who taught it, and students who studied from it. We have also kept 
up with new results in the area as they were published in the literature. 
For this reason we felt that now was a good time to come out with a 
new edition. While some of the basic material remains, we have made 
several big changes and many small changes which we feel will make the 
use of the book easier. 

The most visible change is that the book is written in LaTeX and the
figures are drawn in CorelDRAW, in contrast to the typewritten text 
and hand-drawn figures of the first edition. We have also included some 
problems along with their numerical solutions obtained using EXCEL. 

The most important change is the division of the material in the old 
Chapter 3, into Chapters 3 and 4 in the new edition. Chapter 3 now 
contains models having mixed (control and state) constraints, current 
value formulations, terminal conditions and model types, while Chapter 
4 covers the more difficult topic of pure state constraints, together with 
mixed constraints. Each of these chapters contains new results that were
not available when the first edition was published. 

The second most important change is the expansion of the mate- 
rial in the old Section 12.4 on stochastic optimal control theory and its 
becoming the new Chapter 13. The new Chapter 12 now contains the 
following advanced topics on optimal control theory: differential games, 
distributed parameter systems, and impulse control. The new Chapter 
13 provides a brief introduction to stochastic optimal control problems. 
It contains formulations of simple stochastic models in production, mar- 
keting and finance, and their solutions. We deleted the old Chapter 11 of 
the first edition on computational methods, since there are a number of 
excellent references now available on this topic. Some of these references 
are listed in Section 4.2 of Chapter 4 and Section 8.3 of Chapter 8. 



The emphasis of this book is not on mathematical rigor, but rather 
on developing models of realistic situations faced in business and man- 
agement. For that reason we have given, in Chapters 2 and 8, proofs 
of the continuous and discrete maximum principles by using dynamic 
programming and Kuhn-Tucker theory, respectively. More general maximum
principles are stated without proofs in Chapters 3, 4 and 12.

One of the fascinating features of optimal control theory is its ex- 
traordinarily wide range of possible applications. We have covered some 
of these as follows: Chapter 5 covers finance; Chapter 6 considers pro- 
duction and inventory problems; Chapter 7 covers marketing problems; 
Chapter 9 treats machine maintenance and replacement; Chapter 10 
deals with problems of optimal consumption of natural resources (renew- 
able or exhaustible); and Chapter 11 discusses a number of applications 
of control theory to economics. The contents of Chapters 12 and 13 have 
been described earlier. 

Finally, four appendices cover either elementary material, such as 
the theory of differential equations, or very advanced material, whose 
inclusion in the main text would interrupt its continuity. At the end 
of the book is an extensive but not exhaustive bibliography of relevant 
material on optimal control theory including surveys of material devoted 
to specific applications. 

We are deeply indebted to many people for their part in making this 
edition possible. Onur Arugaslan, Didem Demirhan, Gustav Feichtinger, 
Neil Geismar, Richard Hartl, Hong Jiang, Steffen Jørgensen, Subodha
Kumar, Helmut Maurer, Gerhard Sorger, and Denny Yeh made helpful 
comments and suggestions about the first edition or preliminary chapters 
of this revision. Many students who used the first edition, or preliminary 
chapters of this revision, also made suggestions for improvements. We 
would like to express our gratitude to all of them for their help. In 
addition we express our appreciation to Eleanor Balocik, Frank (Youhua)
Chen, Feng Cheng, Howard Chow, Barbara Gordon, Jiong Jiang, Kuntal 
Kotecha, Ming Tam, and Srinivasa Yarrakonda for their typing of the 
various drafts of the manuscript. They were advised by Dirk Beyer, Feng 
Cheng, Subodha Kumar, Young Ryu, Chelliah Sriskandarajah, Wulin 
Suo, Houmin Yan, Hanqin Zhang, and Qing Zhang on the technical 
problems of using LaTeX.

We also thank our wives and children (Andrea, Chantal, Anjuli,
Dorothea, Allison, Emily, and Abigail) for their encouragement and
understanding during the time-consuming task of preparing this revision.





Finally, while we regret that lack of time and pressure of other du- 
ties prevented us from bringing out a second edition soon after the first 
edition went out of print, we sincerely hope that the wait has been worth- 
while. In spite of the numerous applications of optimal control theory 
which already have been made to areas of management science and eco- 
nomics, we continue to believe there is much more that remains to be 
done. We hope the present revision will rekindle interest in furthering 
such applications, and will enhance the continued development in the 
field. 



Richardson, TX, January, 2000 
Pittsburgh, PA, January, 2000 



Suresh P. Sethi 
Gerald L. Thompson 




Chapter 1 



What is Optimal Control 
Theory? 



Many management science applications involve the control of dynamic 
systems, i.e., systems that evolve over time. They are called continuous- 
time systems or discrete-time systems depending on whether time varies 
continuously or discretely. We shall deal with both kinds of systems 
in this book, although the main emphasis will be on continuous-time 
systems. 

Optimal control theory is a branch of mathematics developed to find 
optimal ways to control a dynamic system. The purpose of this book is 
to give an elementary introduction to this mathematical theory and then to
apply it to a wide variety of different situations arising in management 
science. We have deliberately kept the level of mathematics as simple as 
possible in order to make the book accessible to a large audience. The 
only mathematical requirements for this book are elementary calculus, 
including partial differentiation, some knowledge of vectors and matrices, 
and elementary ordinary and partial differential equations. Moreover, 
the last topic is briefly covered in Appendix A. An exception is Chapter 
13 on stochastic optimal control, which also requires some concepts in 
stochastic calculus. Those concepts will be introduced at the beginning 
of that chapter. 

The principal management science applications discussed in this book
are in the following areas: finance, economics, production and inventory, 
marketing, maintenance and replacement, and the consumption of natu- 
ral resources. In each major area we have formulated one or more simple 
models followed by a more complicated model. The reader may wish at
first reading to cover only the simpler models in each area to get an
overall idea of what is possible to do with optimal control theory. Later
the reader may wish to go into more depth in one or more of the applied
areas.

Worked-out examples are provided in most chapters to facilitate the 
exposition. At the end of each chapter, we have listed exercises that the 
reader should solve for deeper understanding of the material presented in 
the chapter. Hints are supplied with some of the exercises. Furthermore, 
difficult exercises are indicated with an asterisk (*). 

1.1 Basic Concepts and Definitions 

We shall use the word system as a primitive term in this book. The only
property that we require of a system is that it is capable of existing in
various states. Let the (real) variable $x(t)$ be the state variable of the
system at time $t \in [0,T]$, where $T > 0$ is a specified time horizon for
the system under consideration. For example, $x(t)$ could measure the
inventory level at time $t$, the amount of advertising goodwill at time $t$,
or the amount of unconsumed wealth or natural resources at time $t$.

We assume that there is a way of controlling the state of the system.
Let the (real) variable $u(t)$ be the control variable of the system at time $t$.
For example, $u(t)$ could be the production rate at time $t$, the advertising
rate at time $t$, etc.

Given the values of the state variable $x(t)$ and the control variable
$u(t)$ at time $t$, the state equation, a differential equation,

$$\dot{x}(t) = f(x(t), u(t), t), \quad x(0) = x_0, \tag{1.1}$$

specifies the instantaneous rate of change in the state variable, where
$\dot{x}(t)$ is a commonly used notation for $dx(t)/dt$, $f$ is a given function of
$x$, $u$, and $t$, and $x_0$ is the initial value of the state variable. If we know
the initial value $x_0$ and the control trajectory, i.e., the values of $u(t)$ over
the whole time interval $0 \le t \le T$, then we can integrate (1.1) to get
the state trajectory, i.e., the values of $x(t)$ over the same time interval.
We want to choose the control trajectory so that the state and control
trajectories maximize the objective functional, or simply the objective
function,

$$J = \int_0^T F(x(t), u(t), t)\,dt + S[x(T), T]. \tag{1.2}$$

In (1.2), $F$ is a given function of $x$, $u$, and $t$, which could measure
the benefit minus the cost of advertising, the utility of consumption, the
negative of the cost of inventory and production, etc. Also in (1.2), the
function $S$ gives the salvage value of the ending state $x(T)$ at time $T$.
The salvage value is needed so that the solution will make “good sense”
at the end of the horizon.
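
To make the roles of the state equation and the objective function concrete, the following Python sketch integrates a state equation numerically for a given control trajectory and evaluates the corresponding objective value. It is only an illustration and not part of the theory developed in this book: the forward-Euler scheme, the function name `simulate`, and all numerical data (the particular $f$, $F$, $S$, and the constant control) are assumptions chosen for the example.

```python
def simulate(f, F, S, u, x0, T, steps=1000):
    """Forward-Euler integration of dx/dt = f(x, u, t) plus a Riemann sum for J.

    Returns the terminal state x(T) and the objective value
    J = integral_0^T F(x, u, t) dt + S(x(T), T).
    """
    dt = T / steps
    x, J = x0, 0.0
    for k in range(steps):
        t = k * dt
        uk = u(t)
        J += F(x, uk, t) * dt      # accumulate the running payoff
        x += f(x, uk, t) * dt      # advance the state by one Euler step
    return x, J + S(x, T)          # add the salvage value at the end


# Hypothetical data: dx/dt = u - 0.1 x, payoff F = x - u^2, salvage S = 2 x(T).
x_T, J = simulate(
    f=lambda x, u, t: u - 0.1 * x,
    F=lambda x, u, t: x - u ** 2,
    S=lambda x, T: 2.0 * x,
    u=lambda t: 0.5,               # a constant control trajectory
    x0=1.0,
    T=10.0,
)
```

Changing the function passed as `u` and comparing the resulting values of `J` is a crude way of exploring which control trajectories perform better; the maximum principle of Chapter 2 replaces this trial and error with necessary conditions for optimality.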

Usually the control variable $u(t)$ will be constrained. We indicate
this as

$$u(t) \in \Omega(t), \quad t \in [0,T], \tag{1.3}$$

where $\Omega(t)$ is the set of possible values of the control variable at time $t$.

Optimal control problems involving (1.1), (1.2), and (1.3) will be
treated in Chapter 2.

In Chapter 3, we shall replace (1.3) by inequality constraints involving
control variables. In addition, we shall allow these constraints to
depend on state variables. These are called mixed inequality constraints
and written as

$$g(x(t), u(t), t) \ge 0, \quad t \in [0,T], \tag{1.4}$$

where $g$ is a given function of $u$, $t$ and possibly $x$.

There is yet another type of constraint involving only state variables
(but not control variables). These are written as

$$h(x, t) \ge 0, \quad t \in [0,T], \tag{1.5}$$

where $h$ is a given function of $x$ and $t$. These are the most difficult
type of constraints to deal with, and are known as pure state inequality
constraints. Problems involving (1.1), (1.2), (1.4), and (1.5) will be
treated in Chapter 4.
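
The difference between mixed and pure state constraints can be seen by checking both along a sampled trajectory. The sketch below is a hypothetical helper, not something used later in the book: the name `check_constraints`, the sampled values, and the particular functions $g$ and $h$ are all assumptions made for illustration.

```python
def check_constraints(xs, us, ts, g=None, h=None, tol=1e-9):
    """Return a list of (time, kind) pairs at which a constraint is violated.

    g(x, u, t) >= 0 is a mixed constraint as in (1.4);
    h(x, t) >= 0 is a pure state constraint as in (1.5).
    """
    violations = []
    for x, u, t in zip(xs, us, ts):
        if g is not None and g(x, u, t) < -tol:
            violations.append((t, "mixed"))
        if h is not None and h(x, t) < -tol:
            violations.append((t, "state"))
    return violations


# Hypothetical example: the control may not exceed the current state
# (g = x - u >= 0), and the state must stay nonnegative (h = x >= 0).
ts = [0.0, 1.0, 2.0, 3.0]
xs = [2.0, 1.0, 0.5, -0.5]
us = [1.0, 1.0, 1.0, 0.0]
bad = check_constraints(xs, us, ts,
                        g=lambda x, u, t: x - u,
                        h=lambda x, t: x)
```

A mixed constraint can always be restored at a given instant by changing $u(t)$, whereas a violated pure state constraint can only be repaired by having chosen different controls earlier. This is one way to see why the latter are harder to handle.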

Finally, we note that these constraints, when imposed, limit the values
the terminal state $x(T)$ may take. We denote this by saying

$$x(T) \in X(T), \tag{1.6}$$

where $X(T)$ is called the reachable set of the state variable at time $T$.
Note that $X(T)$ depends on the initial value $x_0$. Here $X(T)$ is the set
of possible terminal values that can be reached when $x(t)$ and $u(t)$ obey
imposed constraints.

Although the above description of the control problem may seem 
abstract, you will find that in each specific application, the various vari- 
ables and parameters will have specific meanings, which make them easy 
to understand and remember. The examples to be discussed next will 
illustrate this point. 
1.2 Formulation of Simple Control Models

We now formulate three simple models chosen from the areas of produc- 
tion, advertising, and economics. Our only objective here is to identify 
and interpret in these models each of the variables and functions de- 
scribed in the previous section. The solutions for each of these models 
will be given in detail in later chapters. 



Example 1.1 A Production-Inventory Model. The various quantities
that define this model are summarized in Table 1.1 for easy comparison 
with the other models that follow. 



State Variable         $I(t)$ = Inventory Level
Control Variable       $P(t)$ = Production Rate
State Equation         $\dot{I}(t) = P(t) - S(t)$, $I(0) = I_0$
Objective Function     Maximize $\left\{ J = \int_0^T -[h(I(t)) + c(P(t))]\,dt \right\}$
State Constraint       $I(t) \ge 0$
Control Constraints    $0 \le P_{\min} \le P(t) \le P_{\max}$
Terminal Condition     $I(T) \ge I_{\min}$
Exogenous Functions    $S(t)$ = Demand Rate
                       $h(I)$ = Inventory Holding Cost
                       $c(P)$ = Production Cost
Parameters             $T$ = Terminal Time
                       $I_{\min}$ = Minimum Ending Inventory
                       $P_{\min}$ = Minimum Possible Production Rate
                       $P_{\max}$ = Maximum Possible Production Rate
                       $I_0$ = Initial Inventory Level


Table 1.1: The Production-Inventory Model of Example 1.1 



We consider the production and inventory storage of a given good,
say of steel, in order to meet an exogenous demand. The state variable
$I(t)$ measures the number of tons of steel we have on hand at time $t$.
There is an exogenous demand rate $S(t)$ at time $t$, measured in tons of
steel per day, and we must choose the production rate $P(t)$ at time $t$,
also measured in tons of steel per day. Given initial inventory of $I_0$ tons
of steel on hand at $t = 0$, the state equation

$$\dot{I}(t) = P(t) - S(t)$$

describes how the steel inventory changes over time. Since $h(I)$ is the
cost of holding inventory $I$ in dollars per day, and $c(P)$ is the cost of
producing steel at rate $P$, also in dollars per day, the objective function is
to maximize the negative of the sum of the total holding and production
costs over the time interval $[0,T]$. Of course, maximizing the negative
sum is the same as minimizing the sum of holding and production costs.
The state variable constraint, $I(t) \ge 0$, is imposed so that the demand
is satisfied for all $t$. In other words, backlogging of demand is not
permitted. (An alternative formulation is to make $h(I)$ become very large
when $I$ becomes negative, i.e., to impose a stockout penalty cost.) The
control constraints keep the production rate $P(t)$ between a specified
lower bound $P_{\min}$ and a specified upper bound $P_{\max}$. Finally, the
terminal constraint $I(T) \ge I_{\min}$ is imposed so that the terminal inventory is
at least $I_{\min}$.

The statement of the problem is lengthy because of the number of 
variables, functions, and parameters which are involved. However, with 
the production and inventory interpretations as given, it is not difficult 
to see the reasons for each condition. In Chapter 6, various versions of 
this model will be solved in detail. In Section 13.3, we shall deal with a
stochastic version of this model. 
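
As a purely illustrative check of how the pieces of Table 1.1 fit together, the Python sketch below evaluates the total cost of one candidate production plan and tests the state and terminal constraints. The function name `inventory_cost`, the forward-Euler discretization, and every number used (demand, production rate, cost functions, $I_0$, $T$, and $I_{\min} = 20$) are assumptions made for this example only. The plan shown is merely feasible, not optimal.

```python
def inventory_cost(P, S, h, c, I0, T, steps=1000):
    """Simulate dI/dt = P(t) - S(t) by forward Euler.

    Returns (total_cost, lowest_inventory, final_inventory), where total_cost
    approximates the integral of h(I(t)) + c(P(t)) over [0, T].
    """
    dt = T / steps
    I, cost, lowest = I0, 0.0, I0
    for k in range(steps):
        t = k * dt
        cost += (h(I) + c(P(t))) * dt   # holding plus production cost
        I += (P(t) - S(t)) * dt         # inventory balance
        lowest = min(lowest, I)
    return cost, lowest, I


# Hypothetical data: demand of 10 tons/day, production of 12 tons/day,
# linear holding cost and quadratic production cost.
cost, lowest, final = inventory_cost(
    P=lambda t: 12.0,
    S=lambda t: 10.0,
    h=lambda I: 0.5 * I,
    c=lambda P: 0.1 * P ** 2,
    I0=5.0,
    T=30.0,
)
feasible = lowest >= 0.0 and final >= 20.0   # I(t) >= 0 and I(T) >= I_min = 20
```

The objective $J$ of Table 1.1 is the negative of `cost`, so a plan with a lower `cost` that still passes the feasibility test is a better plan.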

Example 1.2 An Advertising Model. The various quantities that define 
this model are summarized in Table 1.2. 

We consider a special case of the Nerlove-Arrow advertising model
which will be discussed in detail in Chapter 7. The problem is to
determine the rate at which to advertise a product at each time $t$. Here
the state variable is advertising goodwill, $G(t)$, which measures how well
the product is known at time $t$. We assume that there is a forgetting
coefficient $\delta$, which measures the rate at which customers tend to forget
the product. To counteract forgetting, advertising is carried out at a
rate measured by the control variable $u(t)$. Hence, the state equation is

$$\dot{G}(t) = u(t) - \delta G(t),$$

with $G(0) = G_0 > 0$ specifying the initial goodwill for the product.
State Variable         $G(t)$ = Advertising Goodwill
Control Variable       $u(t)$ = Advertising Rate
State Equation         $\dot{G}(t) = u(t) - \delta G(t)$, $G(0) = G_0$
Objective Function     Maximize $\left\{ J = \int_0^\infty [\pi(G(t)) - u(t)] e^{-\rho t}\,dt \right\}$
State Constraint       (none)
Control Constraints    $0 \le u(t) \le Q$
Terminal Condition     (none)
Exogenous Function     $\pi(G)$ = Gross Profit Rate
Parameters             $\delta$ = Goodwill Decay Constant
                       $\rho$ = Discount Rate
                       $Q$ = Upper Bound on Advertising Rate
                       $G_0$ = Initial Goodwill Level


Table 1.2: The Advertising Model of Example 1.2 



The objective function $J$ requires special discussion. Note that the
integral defining $J$ is from time $t = 0$ to time $t = \infty$; we will later call
a problem having an upper time limit of $\infty$ an infinite horizon problem.
Because of this upper limit, the integrand of the objective function
includes the discount factor $e^{-\rho t}$, where $\rho > 0$ is the (constant) discount
rate. Without this discount factor, the integral would (in most cases)
diverge to infinity. Hence, we will see that such a discount factor is an
essential part of infinite horizon models. The rest of the integrand in
the objective function consists of the gross profit rate $\pi(G(t))$, which
results from the goodwill level $G(t)$ at time $t$, less the cost of advertising,
assumed to be proportional to $u(t)$ (proportionality factor = 1); thus
$\pi(G(t)) - u(t)$ is the net profit rate at time $t$. Also, $[\pi(G(t)) - u(t)]e^{-\rho t}$ is
the net profit rate at time $t$ discounted to time 0, i.e., the present value
of the time $t$ profit rate. Hence, $J$ can be interpreted as the total value of
discounted future profits, and is the quantity we are trying to maximize.

There are control constraints $0 \le u(t) \le Q$, where $Q$ is the upper
bound on the advertising level. However, there is no state constraint. It
can be seen from the state equation and the control constraints that the
goodwill $G(t)$ in fact never becomes negative.
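
The role of the discount factor and the nonnegativity of goodwill can both be seen numerically. The following Python sketch is an illustration under stated assumptions rather than a solution of the model: the infinite horizon is truncated at a finite time, the state equation is discretized by forward Euler, and the profit function $\pi(G) = 2\sqrt{G}$, the parameter values, and the name `discounted_profit` are all hypothetical.

```python
import math


def discounted_profit(u, pi, G0, delta, rho, horizon=200.0, steps=20000):
    """Approximate J = integral_0^inf [pi(G) - u] e^{-rho t} dt.

    The infinite horizon is truncated at `horizon`; with rho > 0 the
    neglected tail is small as long as the integrand stays bounded.
    Also returns the smallest goodwill level seen along the path.
    """
    dt = horizon / steps
    G, J, G_low = G0, 0.0, G0
    for k in range(steps):
        t = k * dt
        uk = u(t)
        J += (pi(G) - uk) * math.exp(-rho * t) * dt
        G += (uk - delta * G) * dt          # dG/dt = u - delta * G
        G_low = min(G_low, G)
    return J, G_low


# Hypothetical data: pi(G) = 2 sqrt(G); advertising at u = 1 keeps G at 4.
profit = lambda G: 2.0 * math.sqrt(G)
J, G_low = discounted_profit(lambda t: 1.0, profit, G0=4.0, delta=0.25, rho=0.1)

# With no advertising at all, goodwill decays toward zero but stays nonnegative.
_, G_low_no_ads = discounted_profit(lambda t: 0.0, profit, G0=4.0, delta=0.25, rho=0.1)
```

In the first run the net profit rate is constant at 3, so the discounted integral is close to $3/\rho = 30$, whereas the undiscounted integral of a constant positive profit rate would grow without bound as the horizon increases.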

You will find it instructive to compare this model with the previous 
one and note similarities and differences. 

Example 1.3 A Consumption Model. Rich Rentier plans to retire at age
65 with a lump sum pension of $W_0$ dollars. Rich estimates his remaining
life span to be $T$ years, and he wants to consume his wealth during
these retirement years and leave a bequest at time $T$ in a way that will
maximize his total utility of consumption and bequest.

Since he does not want to take investment risks, Rich plans to put
his money into a savings account that pays interest at a continuously
compounded rate of $r$. If we let the state variable $W(t)$ denote the
wealth at time $t$ and the control variable $C(t)$ the rate of consumption
at time $t$, it is easy to see that the state equation is

$$\dot{W}(t) = rW(t) - C(t),$$

with the initial condition $W(0) = W_0$. Letting $U(C)$ be the utility
function of consumption $C$ and $B(W)$ be the bequest function of leaving
a bequest of amount $W$ at time $T$, we see that the problem can be stated
as an optimal control problem with the various variables, equations, and
constraints shown in Table 1.3.

Note that the objective function has two parts: first the integral of
the discounted utility of consumption from 0 to $T$ with $\rho$ as the discount
rate; and second the bequest function $B(W)$, which measures Rich’s
discounted utility of leaving an estate $W$ to his heirs at time $T$. If he has
no heirs and does not care about charity, then $B(W) = 0$. However, if he
has heirs or a favorite charity to whom he wishes to leave money, then
$B(W)$ measures the strength of his desire to leave an estate of amount
$W$. The nonnegativity constraints on state and control variables are
obviously natural requirements that must be imposed.

You will be asked to solve this problem in Exercise 2.1 after you 
have learned the maximum principle in the next chapter. Moreover, a 
stochastic extension of the consumption problem, known as a consump- 
tion/investment problem will be discussed in Section 13.5. 
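
Before solving the problem, it may help to evaluate the objective for one simple consumption plan. The Python sketch below does this under assumptions that are ours and not part of the model: logarithmic utility, no bequest motive, a forward-Euler discretization, the hypothetical function name `consumption_objective`, and illustrative parameter values. Rich is assumed to consume exactly the interest on his wealth, so his wealth stays constant.

```python
import math


def consumption_objective(C, U, B, W0, r, rho, T, steps=10000):
    """Evaluate J = integral_0^T U(C) e^{-rho t} dt + B(W(T)) e^{-rho T}
    for a given consumption path C(t), with dW/dt = r W - C and W(0) = W0.

    Returns (J, W_T), or None if wealth becomes negative along the way.
    """
    dt = T / steps
    W, J = W0, 0.0
    for k in range(steps):
        t = k * dt
        c = C(t)
        J += U(c) * math.exp(-rho * t) * dt
        W += (r * W - c) * dt
        if W < 0.0:                  # violates the state constraint W(t) >= 0
            return None
    return J + B(W) * math.exp(-rho * T), W


# Hypothetical data: log utility, no bequest motive, consume the interest only.
result = consumption_objective(
    C=lambda t: 0.05 * 100.0,
    U=math.log,
    B=lambda W: 0.0,
    W0=100.0, r=0.05, rho=0.1, T=20.0,
)

# Consuming far too much exhausts wealth before T, so the plan is infeasible.
too_much = consumption_objective(
    C=lambda t: 50.0, U=math.log, B=lambda W: 0.0,
    W0=100.0, r=0.05, rho=0.1, T=20.0,
)
```

With no bequest motive, leaving the whole of $W_0$ unspent at time $T$ is clearly wasteful, which suggests that the optimal plan consumes more than the interest alone.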



1.3 History of Optimal Control Theory 

Optimal control theory is an extension of the calculus of variations (see 
Appendix B), so we discuss the history of the latter first. 
State Variable         $W(t)$ = Wealth
Control Variable       $C(t)$ = Consumption Rate
State Equation         $\dot{W}(t) = rW(t) - C(t)$, $W(0) = W_0$
Objective Function     Max $\left\{ J = \int_0^T U(C(t)) e^{-\rho t}\,dt + B[W(T)] e^{-\rho T} \right\}$
State Constraint       $W(t) \ge 0$
Control Constraint     $C(t) \ge 0$
Terminal Condition     (none)
Exogenous Functions    $U(C)$ = Utility of Consumption
                       $B(W)$ = Bequest Function
Parameters             $T$ = Terminal Time
                       $W_0$ = Initial Wealth
                       $\rho$ = Discount Rate
                       $r$ = Interest Rate


Table 1.3: The Consumption Model of Example 1.3 



The creation of the calculus of variations occurred almost immedi- 
ately after the invention or formalization of calculus by Newton and 
Leibniz. An important problem in calculus is to find an argument of 
a function at which the function takes on its maximum or minimum. 
The extension of this problem posed in the calculus of variations is to 
find a function which maximizes or minimizes the value of an integral 
or functional of that function. As might be expected, the extremum
problem of the calculus of variations is much harder than the extremum 
problem of the calculus. Euler and Lagrange are generally considered 
to be the founders of the calculus of variations. Others who contributed 
much to the early development of the field include some of the greatest 
mathematicians such as Newton, Legendre, and the Bernoulli brothers. 

A celebrated problem first considered by the calculus of variations
was the path of least time or the Brachistochrone problem. The problem
is illustrated in Figure 1.1. It involves finding a curve $\Gamma$ connecting the
two points (0,0) and (1,1) in the vertical plane with the property that
a bead sliding along the curve under the influence of gravity will move
from (0,0) to (1,1) in the shortest possible time. The problem was posed
by Johann Bernoulli in 1696 and solved by him and his brother Jacob
Bernoulli independently in 1697. In Appendix B.4, we provide a solution
to the Brachistochrone problem and show that the form of the solution
curve is a cycloid.

[Figure 1.1: The Brachistochrone problem]

In the nineteenth and early twentieth centuries, many mathemati- 
cians contributed to the theory of the solution of calculus of varia- 
tions problems. These include Hamilton, Jacobi, Bolza, Weierstrass, 
Carathéodory, and Bliss. See Pesch and Bulirsch (1994) for a historical
perspective on the subject of calculus of variations. 

Converting calculus of variations problems into control theory prob- 
lems requires one more conceptual extension — the addition of control 
variables to the state equations. Rufus Isaacs (1965) made such an ex- 
tension in two-person pursuit-evasion games in the period 1948-1955. 
Richard Bellman (1957) made a similar extension with the idea of dy- 
namic programming. 

The starting date of modern control theory was the publication
in Russian in 1958 (1962 in English) of the book, The Mathematical
Theory of Optimal Processes, by Pontryagin, Boltyanskii, Gamkrelidze,
and Mishchenko (1962). Well-known American mathematicians associated
with the maximum principle include Valentine, McShane, Hestenes,
Berkovitz, and Neustadt. The importance of the book by Pontryagin et
al. lies not only in a rigorous formulation of a calculus of variations 
problem with constrained control variables, but also in the proof of the 
maximum principle for optimal control problems. The maximum princi- 
ple permits the decoupling of the dynamic problem over time using what 
are known as adjoint variables or shadow prices into a series of problems 
each of which holds at a single instant of time. The optimal solution of 
the instantaneous problems can be shown to give the optimal solution 
to the overall problem. 

In this book we will be concerned principally with the application of
the maximum principle in its various forms to find the solutions of a wide 
variety of applied problems in management science and economics. It is 
hoped that the reader, after reading some of these problems and their 
solutions, will appreciate, as we do, the importance of the maximum 
principle. 

Some important books and surveys of the applications of the maximum
principle to management science and economics are Connors and
Teichroew (1967), Arrow and Kurz (1970), Hadley and Kemp (1971),
Bensoussan, Hurst, and Naslund (1974), Stöppler (1975), Clark (1976),
Sethi (1977a, 1978a), Tapiero (1977), Wickwire (1977), Bookbinder and
Sethi (1980), Lesourne and Leban (1982), Tu (1984), Feichtinger and
Hartl (1986), Carlson and Haurie (1987), Seierstad and Sydsæter (1987),
Tapiero (1988), Erickson (1991), Léonard and Long (1992), Van Hilten,
Kort, and Van Loon (1993), Feichtinger, Hartl, and Sethi (1994), Kamien
and Schwartz (1998), Maimon, Khmelnitsky, and Kogan (1998), and
Dockner, Jørgensen, Long, and Sorger (2000). Nevertheless, we have
included in our bibliography many works of interest.



1.4 Notation and Concepts Used 

In order to make the book readable, we shall adopt the following notation
which will hold throughout the book. In addition, we shall define some
important concepts that are required, including those of concave, convex
and affine functions, and saddle points.

We use the symbol “=” to mean “is equal to” or “is defined to be
equal to” or “is identically equal to” depending on the context. The
symbol “:=” means “is defined to be equal to,” the symbol “≡” means
“is identically equal to,” and the symbol “≈” means “is approximately
equal to.” The double arrow “⇒” means “implies” and “∈” means “is a
member of.” The symbol □ indicates the end of a proof.

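Let $y$ be an $n$-component column vector and $z$ be an $m$-component
row vector, i.e.,

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = (y_1, \ldots, y_n)^T \quad \text{and} \quad z = (z_1, \ldots, z_m),$$

where a superscript $T$ on a vector (or, a matrix) denotes the transpose
of the vector (or, the matrix). If $y$ and $z$ are functions of time $t$, a scalar,
then the time derivatives $\dot{y} := dy/dt$ and $\dot{z} := dz/dt$ are defined as

$$\dot{y} = \frac{dy}{dt} = (\dot{y}_1, \ldots, \dot{y}_n)^T \quad \text{and} \quad \dot{z} = \frac{dz}{dt} = (\dot{z}_1, \ldots, \dot{z}_m),$$

where $\dot{y}_i$ and $\dot{z}_j$ denote the time derivatives $dy_i/dt$ and $dz_j/dt$,
respectively.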

When $n = m$, we can define the inner product

$$zy = \sum_{i=1}^{n} z_i y_i. \tag{1.7}$$



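More generally, if

$$A = \{a_{ij}\} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mk} \end{pmatrix}$$

is an $m \times k$ matrix and $B = \{b_{ij}\}$ is a $k \times n$ matrix, we define the matrix
product $C = \{c_{ij}\} = AB$, which is an $m \times n$ matrix with components

$$c_{ij} = \sum_{r=1}^{k} a_{ir} b_{rj}. \tag{1.8}$$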



Let $E^k$ denote the $k$-dimensional Euclidean space. Its elements are
$k$-component vectors, which may be either row or column vectors,
depending on the context. Thus in (1.7), $y \in E^n$ is a column vector and
$z \in E^m$ is a row vector.
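
For readers who like to check such definitions on small numbers, the inner product (1.7) and the matrix product can be written out directly. The sketch below uses plain Python lists; the function names `inner` and `matmul` and the sample vectors are our own choices for illustration. A column vector is stored either as a flat list or as a list of one-element rows.

```python
def inner(z, y):
    """Inner product z y of a row vector z and a column vector y, as in (1.7)."""
    if len(z) != len(y):
        raise ValueError("the inner product requires n = m")
    return sum(zi * yi for zi, yi in zip(z, y))


def matmul(A, B):
    """Product C = AB of an m x k matrix A and a k x n matrix B (lists of rows)."""
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions must agree")
    n = len(B[0])
    return [[sum(A[i][r] * B[r][j] for r in range(k)) for j in range(n)]
            for i in range(len(A))]


z = [1, 2, 3]            # row vector
y = [4, 5, 6]            # column vector, stored as a flat list
A = [[1, 2, 3]]          # z viewed as a 1 x 3 matrix
B = [[4], [5], [6]]      # y viewed as a 3 x 1 matrix

zy = inner(z, y)         # a scalar
AB = matmul(A, B)        # a 1 x 1 matrix holding the same number
BA = matmul(B, A)        # a 3 x 3 matrix: the order of the factors matters
```

The inner product is thus the special case of the matrix product in which a $1 \times n$ matrix multiplies an $n \times 1$ matrix.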



Differentiating vectors and matrices with respect to scalars 



Let $f : E^1 \to E^k$ be a $k$-dimensional function of a scalar variable $t$.
If $f$ is a row vector, then we define

$$\frac{df}{dt} = f_t = (f_{1t}, f_{2t}, \ldots, f_{kt}), \text{ a row vector.}$$

We will also use the notation $f' = (f_1', f_2', \ldots, f_k')$ and $f'(t)$ in place of
$f_t$. If $f$ is a column vector, then

$$\frac{df}{dt} = f_t = \begin{pmatrix} f_{1t} \\ f_{2t} \\ \vdots \\ f_{kt} \end{pmatrix} = (f_{1t}, f_{2t}, \ldots, f_{kt})^T, \text{ a column vector.}$$

Once again, $f_t$ may also be written as $f'$ or $f'(t)$.

A similar rule applies if a matrix function is differentiated with re- 
spect to a scalar. 

Differentiating scalars with respect to vectors

If F(y, z) is a scalar function of vectors y and z, i.e., $F : E^n \times E^m \to E^1$,
n ≥ 2, m ≥ 2, then the gradients $F_y$ and $F_z$ are defined, respectively,
as

$$ F_y = (F_{y_1}, \ldots, F_{y_n}), \quad \text{a row vector,} \tag{1.9} $$

and

$$ F_z = \begin{pmatrix} F_{z_1} \\ \vdots \\ F_{z_m} \end{pmatrix}, \quad \text{a column vector,} \tag{1.10} $$

where $F_{y_i}$ and $F_{z_j}$ denote the partial derivatives with respect to the
subscripted variables. Recall that y is a column vector and z is a row
vector. Thus, our convention is that the gradient with respect to a
column vector is a row vector and the gradient with respect to a row
vector is a column vector.



Differentiating vectors with respect to vectors 

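If $f : E^n \times E^m \to E^k$ is a k-dimensional vector function, f either row
or column, k ≥ 2, i.e.,

$$ f = (f_1, \ldots, f_k) \quad \text{or} \quad f = (f_1, \ldots, f_k)^T, $$

where each component $f_i = f_i(y, z)$ depends on the column vector $y \in E^n$
and the row vector $z \in E^m$, n ≥ 2, m ≥ 2, then $f_z$ will denote the k × m
matrix

$$ f_z = \begin{pmatrix} \partial f_1/\partial z_1 & \partial f_1/\partial z_2 & \cdots & \partial f_1/\partial z_m \\ \partial f_2/\partial z_1 & \partial f_2/\partial z_2 & \cdots & \partial f_2/\partial z_m \\ \vdots & \vdots & & \vdots \\ \partial f_k/\partial z_1 & \partial f_k/\partial z_2 & \cdots & \partial f_k/\partial z_m \end{pmatrix} = \{\partial f_i/\partial z_j\}, \tag{1.11} $$

and $f_y$ will denote the k × n matrix

$$ f_y = \begin{pmatrix} \partial f_1/\partial y_1 & \partial f_1/\partial y_2 & \cdots & \partial f_1/\partial y_n \\ \partial f_2/\partial y_1 & \partial f_2/\partial y_2 & \cdots & \partial f_2/\partial y_n \\ \vdots & \vdots & & \vdots \\ \partial f_k/\partial y_1 & \partial f_k/\partial y_2 & \cdots & \partial f_k/\partial y_n \end{pmatrix} = \{\partial f_i/\partial y_j\}. \tag{1.12} $$

Matrices $f_z$ and $f_y$ are known as Jacobian matrices. It should be
emphasized that the rule of defining a Jacobian does not depend on the row or
column nature of the function and its arguments. Thus,

$$ f_z = f^T_z = f_{z^T} = f^T_{z^T}, $$

where by $f^T_z$ we mean $(f^T)_z$ and not $(f_z)^T$.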

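Applying the rule (1.11) to $F_y$ in (1.9) and the rule (1.12) to $F_z$ in
(1.10), respectively, we obtain $F_{yz} = (F_y)_z$ to be the n × m matrix

$$ F_{yz} = \begin{pmatrix} F_{y_1 z_1} & F_{y_1 z_2} & \cdots & F_{y_1 z_m} \\ F_{y_2 z_1} & F_{y_2 z_2} & \cdots & F_{y_2 z_m} \\ \vdots & \vdots & & \vdots \\ F_{y_n z_1} & F_{y_n z_2} & \cdots & F_{y_n z_m} \end{pmatrix} = \left\{ \frac{\partial^2 F}{\partial y_i \partial z_j} \right\}, \tag{1.13} $$

and $F_{zy} = (F_z)_y$ to be the m × n matrix

$$ F_{zy} = \begin{pmatrix} F_{z_1 y_1} & F_{z_1 y_2} & \cdots & F_{z_1 y_n} \\ F_{z_2 y_1} & F_{z_2 y_2} & \cdots & F_{z_2 y_n} \\ \vdots & \vdots & & \vdots \\ F_{z_m y_1} & F_{z_m y_2} & \cdots & F_{z_m y_n} \end{pmatrix} = \left\{ \frac{\partial^2 F}{\partial z_i \partial y_j} \right\}. \tag{1.14} $$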



Product rule for differentiation 

Let g be an n-component row vector function and f be an n-component
column vector function of an n-component vector x. Then
in Exercise 1.8, you are asked to verify the following identity when x is
a column vector:

$$ (gf)_x = g f_x + f^T g^T_x = g f_x + f^T g_x. \tag{1.15} $$

In Exercise 1.9, you are asked to show further that with $g = F_x$, where
$x \in E^n$, n ≥ 2, and the function $F : E^n \to E^1$ is twice continuously
differentiable so that $F_{xx} = (F_{xx})^T$, then

$$ (gf)_x = (F_x f)_x = F_x f_x + f^T F_{xx} = F_x f_x + (F_{xx} f)^T. \tag{1.16} $$

The latter result will be used in Chapter 2 for the derivation of (2.25).

Many mathematical expressions in this book will be vector equations
or inequalities involving vectors and vector functions. Since scalars are
a special case of vectors, these expressions hold just as well for scalar
equations or inequalities involving scalars and scalar functions. In fact,
it may be a good idea to read them as scalar expressions on the first
reading. Then in the second and further readings, the extension to vector
form will be easier.
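
The product rule (1.15) can be checked numerically before attempting Exercise 1.8. The following is only a sketch under stated assumptions: the test functions g and f, the evaluation point, and the helper jacobian are hypothetical choices made for illustration, and the Jacobians are approximated by central differences using the convention {∂f_i/∂x_j} of (1.11) and (1.12).

```python
import numpy as np

def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian {d func_i / d x_j} of func at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.atleast_1d(func(x))
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (np.atleast_1d(func(x + e)) - np.atleast_1d(func(x - e))) / (2 * h)
    return J

# Hypothetical test functions of x in E^3 (not taken from the text)
def g(x):  # components of a row vector function
    return np.array([x[0] * x[1], np.sin(x[2]), x[0] ** 2])

def f(x):  # components of a column vector function
    return np.array([np.exp(x[1]), x[0] + x[2], x[1] * x[2]])

x0 = np.array([0.3, -1.2, 0.7])

# Left-hand side: gradient (a row vector) of the scalar product g f
lhs = jacobian(lambda x: g(x) @ f(x), x0).ravel()

# Right-hand side of (1.15): g f_x + f^T g_x
rhs = g(x0) @ jacobian(f, x0) + f(x0) @ jacobian(g, x0)

print(np.allclose(lhs, rhs, atol=1e-6))
```

A numerical agreement of this kind does not replace the derivation asked for in the exercise; it only guards against sign or transpose slips when applying the identity.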

The norm of an m-component row or column vector z is defined to
be

$$ \| z \| = \sqrt{z_1^2 + \cdots + z_m^2}. \tag{1.17} $$

The norm of a vector is commonly used to define a neighborhood $N_{z_0}$ of
a point, e.g.,

$$ N_{z_0} = \{ z \mid \| z - z_0 \| < \varepsilon \}, \tag{1.18} $$

where ε > 0 is a small positive real number.

We shall occasionally make use of the so-called “little-o” notation
o(z). A function $F(z) : E^m \to E^1$ is said to be of the order o(z), if

$$ \lim_{\| z \| \to 0} \frac{F(z)}{\| z \|} = 0. $$

The most common use of this notation will be to collect higher order
terms in a series expansion.

In the continuous-time models discussed in this book, we generally
will use x(t) to denote the state (column) vector, u(t) to denote the
control (column) vector, and λ(t) to denote the adjoint (row) vector.
Whenever there is no possibility of confusion, we will suppress the time
indicator (t) from these vectors and write them as x, u, and λ,
respectively. When talking about optimal state, control, and adjoint vectors,
we put an asterisk as a superscript, i.e., as x*, u*, and λ*,
respectively.

The norm of an m-dimensional row or column vector function z(t),
t ∈ [0, T], is defined to be







(1.19) 



In Chapter 4 and some other chapters, we shall encounter functions
of time with jumps. For such functions, it is useful to have the concepts
of left and right limits. These are defined, respectively, as

$$ x(t^-) = \lim_{\varepsilon \downarrow 0} x(t - \varepsilon) \quad \text{and} \quad x(t^+) = \lim_{\varepsilon \downarrow 0} x(t + \varepsilon). \tag{1.20} $$

In the discrete-time models introduced in Chapter 8 and applied in
Chapter 9, we use $x^k$, $u^k$, and $\lambda^k$ to denote state, control, and adjoint
variables, respectively, at time k, k = 0, 1, 2, ..., T. We also denote the
difference operator by

$$ \Delta x^k := x^{k+1} - x^k. $$

As in the continuous-time case, the optimal variables have an asterisk as
a superscript; thus $x^{k*}$, $u^{k*}$, and $\lambda^{k*}$ denote quantities along an optimal
path.

In order to specify the optimal control for linear control problems,
we will introduce a special notation, called the bang function, as

$$ \text{bang}[b_1, b_2; W] = \begin{cases} b_1 & \text{if } W < 0, \\ \text{undefined} & \text{if } W = 0, \\ b_2 & \text{if } W > 0. \end{cases} \tag{1.21} $$



In order to specify the optimal control for linear-quadratic problems,
we define another special function, called the sat function, as

$$ \text{sat}[y_1, y_2; W] = \begin{cases} y_1 & \text{if } W < y_1, \\ W & \text{if } y_1 \le W \le y_2, \\ y_2 & \text{if } W > y_2. \end{cases} \tag{1.22} $$

The word “sat” is short for the word “saturation.” The latter name
comes from an electrical engineering application to saturated amplifiers.
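
The two special functions (1.21) and (1.22) translate directly into code. The sketch below is one possible rendering, not a prescribed implementation: the function names bang and sat mirror the notation of the text, returning None for the undefined case W = 0 is an assumption made here, and the switching function W(t) = 1 - t and the bounds used in the loop are hypothetical.

```python
def bang(b1, b2, w):
    """bang[b1, b2; W] as in (1.21); returns None where it is undefined (W = 0)."""
    if w < 0:
        return b1
    if w > 0:
        return b2
    return None

def sat(y1, y2, w):
    """sat[y1, y2; W] as in (1.22): W restricted to the interval [y1, y2]."""
    if w < y1:
        return y1
    if w > y2:
        return y2
    return w

# Hypothetical switching function W(t) = 1 - t sampled at a few times
for t in [0.0, 0.5, 1.0, 1.5, 2.0]:
    w = 1.0 - t
    print(t, bang(0, 2, w), sat(-0.25, 0.25, w))
```

In a problem that is linear in the control, the value at W = 0 is determined by other considerations (for example a singular control), which is why the sketch leaves it unset rather than picking a bound.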

In several applications to be discussed, we will need the concept of
impulse control, which is sometimes needed in cases when an unbounded
control can be applied for a very short time. An example is the advertising
model in Table 1.2 when Q = ∞. We apply unbounded control for a
short time in order to cause a jump discontinuity in the state variable.
For the example in Table 1.2, this might mean an intense advertising
campaign (a media blitz) in order to increase advertising goodwill by a
finite amount in a very short time. The impulse function defined below
is required to evaluate the integral in the objective function, which
measures the cost of the intense advertising campaign.

Suppose we want to apply an impulse control at time t to change the
state variable from $x(t) = x_1$ to the value $x_2$ “immediately” after t, i.e.,
$x(t^+) = x_2$. To compute its contribution to the objective function (1.2),
we use the following procedure: Given ε > 0 and a constant control u(ε),
integrate (1.1) from t to t + ε with $x(t) = x_1$ and choose u(ε) so that
$x(t + \varepsilon) = x_2$; this gives the trajectory x(τ; ε, u(ε)) for τ ∈ [t, t + ε]. We
can now compute

$$ \text{imp}(x_1, x_2, t) = \lim_{\varepsilon \to 0} \int_t^{t+\varepsilon} F(x, u, \tau)\, d\tau. \tag{1.23} $$

If the impulse is applied only at time t, then we can calculate (1.2) as

$$ J = \int_0^t F(x, u, \tau)\, d\tau + \text{imp}(x_1, x_2, t) + \int_t^T F(x, u, \tau)\, d\tau + S[x(T), T]. \tag{1.24} $$

If there are several instants at which impulses are applied, then this
procedure is easily extended. Examples of the use of (1.24) occur in
Chapters 5 and 7. We frequently omit t in (1.23) when the impulse
function is independent of t.

Convex set and convex hull 

A set $D \subset E^n$ is a convex set if for each pair of points y, z ∈ D, the
entire line segment joining these two points is also in D, i.e.,

$$ py + (1 - p)z \in D, \quad \text{for each } p \in [0, 1]. $$

Given $x^i \in E^n$, i = 1, 2, ..., l, we define $y \in E^n$ to be a convex
combination of $x^i \in E^n$, if there exist $p_i \ge 0$ such that

$$ \sum_{i=1}^{l} p_i = 1 \quad \text{and} \quad y = \sum_{i=1}^{l} p_i x^i. $$

The convex hull of a set $D \subset E^n$ is

$$ \text{co} D := \left\{ \sum_{i=1}^{l} p_i x^i : \sum_{i=1}^{l} p_i = 1, \; p_i \ge 0, \; x^i \in D, \; i = 1, 2, \ldots, l, \; l = 1, 2, \ldots \right\}. $$

In other words, coD is the set of all convex combinations of points in D.

Concave and convex functions

A real-valued function ψ defined on a convex set $D \subset E^n$, i.e.,
$\psi : D \to E^1$, is concave, if for each pair of points y, z ∈ D and for all
p ∈ [0, 1],

$$ \psi(py + (1 - p)z) \ge p\psi(y) + (1 - p)\psi(z). $$

If the inequalities in the above definition are strict for all y, z ∈ D with
y ≠ z, and 0 < p < 1, then ψ is called a strictly concave function.

In the single dimensional case of n = 1, there is an enlightening
geometrical interpretation. Namely, ψ(x) defined on an interval D =
[a, b] is concave if, for each pair of points on the graph of ψ(x), the line
segment joining these two points lies entirely below or on the graph of
ψ(x); see Figure 1.2.

Figure 1.2: A Concave Function

If ψ(x) is a differentiable function on the interval [a, b], then it is
concave, if for each pair of points y, z ∈ [a, b],

$$ \psi(z) \le \psi(y) + \psi_x(y)(z - y). $$

Furthermore, if the function ψ is twice differentiable, then it is concave,
if at each point in [a, b],

$$ \psi_{xx} \le 0. $$

Finally, if $\psi : D \to E^1$ defined on a convex set $D \subset E^n$ is a concave
function, then the negative of the function ψ, i.e., $-\psi : D \to E^1$, is a
convex function.
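
The defining inequality for concavity can be spot-checked on sample points. The sketch below is illustrative only: the helper is_concave_on_samples, the grid on [0, 4], and the two test functions are assumptions introduced here. A sampled test of this kind can refute concavity but cannot prove it.

```python
import numpy as np

def is_concave_on_samples(psi, points, num_weights=11, tol=1e-12):
    """Check psi(p*y + (1-p)*z) >= p*psi(y) + (1-p)*psi(z) on sampled pairs."""
    weights = np.linspace(0.0, 1.0, num_weights)
    for y in points:
        for z in points:
            for p in weights:
                lhs = psi(p * y + (1 - p) * z)
                rhs = p * psi(y) + (1 - p) * psi(z)
                if lhs < rhs - tol:
                    return False  # a violating triple (y, z, p) was found
    return True

grid = np.linspace(0.0, 4.0, 21)

sqrt_ok = is_concave_on_samples(np.sqrt, grid)             # square root on [0, 4]
square_ok = is_concave_on_samples(lambda x: x ** 2, grid)  # x^2 is convex, not concave

print(sqrt_ok, square_ok)
```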

Affine function and homogeneous function of degree one 

A function $\psi : E^n \to E^1$ is said to be affine, if ψ(x) − ψ(0) is linear.

A function $\psi : E^n \to E^1$ is said to be homogeneous of degree one, if
ψ(bx) = bψ(x), where b is a scalar constant.

Saddle point 

Let ψ(x, y) be a real-valued function defined on the space $E^n \times E^m$,
i.e., $\psi : E^n \times E^m \to E^1$. A point $(\hat{x}, \hat{y}) \in E^n \times E^m$ is called a saddle
point of ψ(x, y), if

$$ \psi(x, \hat{y}) \le \psi(\hat{x}, \hat{y}) \le \psi(\hat{x}, y) \quad \text{for all } x \in E^n \text{ and } y \in E^m. $$

Note that a saddle point may not exist, and even if it exists, it may
not be unique. Note also that

$$ \psi(\hat{x}, \hat{y}) = \max_x \psi(x, \hat{y}) = \min_y \psi(\hat{x}, y). $$

Intuitively, this could produce a picture like a horse saddle as shown in
Figure 1.3. Hence, the name saddle point for a point like S in Figure
1.3.

The concept of a saddle point is very important in game theory, and
it is encountered in Section 12.1.

Figure 1.3: An Illustration of a Saddle Point

Linear independence and rank of a matrix

A set of vectors $a_1, a_2, \ldots, a_n$ from $E^m$ is said to be linearly dependent
if there exist scalars $p_i$ not all zero such that

$$ \sum_{i=1}^{n} p_i a_i = 0. \tag{1.25} $$

If the only set of $p_i$ for which (1.25) holds is $p_1 = p_2 = \cdots = p_n = 0$,
then the vectors are said to be linearly independent.

The rank (or more precisely the column rank) of an m × n matrix
A, written rank(A), is the maximum number of linearly independent
columns in A.

An m × n matrix A is of full rank if

rank(A) = n.
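
As a small numerical illustration of these definitions, the rank of a matrix can be computed with NumPy's numpy.linalg.matrix_rank. The matrix A below is a hypothetical example chosen so that its third column is the sum of the first two; it is not taken from the text.

```python
import numpy as np

# Hypothetical 4 x 3 matrix: column 3 = column 1 + column 2, so the
# columns are linearly dependent and A is not of full rank.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

rank_A = np.linalg.matrix_rank(A)

# Dropping the dependent column leaves two linearly independent columns,
# so B (4 x 2) satisfies rank(B) = n = 2 and is of full rank.
B = A[:, :2]
rank_B = np.linalg.matrix_rank(B)

print(rank_A, A.shape[1], rank_B, B.shape[1])
```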



EXERCISES FOR CHAPTER 1 

1.1 In Example 1.1, let the functions and parameters of the production-
inventory model be given by:

h(I) = 10I, c(P) = 20P, T = 10, $I_0$ = 1,000,

$P_{\min}$ = 600, $P_{\max}$ = 1200, $I_{\min}$ = 800, S(t) = 900 + 10t.

(a) Set P(t) = 1000 for 0 ≤ t ≤ 10. Determine whether this control
is feasible; if it is feasible, compute the value J of the objective
function.

(b) If P(t) = 800, show that the terminal constraint is violated and
hence the control is infeasible.

(c) If P(t) = $P_{\min}$ for 0 ≤ t < 6 and P(t) = $P_{\max}$ for 6 ≤ t ≤ 10,
show that the control is infeasible because the state constraint is
violated.

1.2 For the advertising model in Example 1.2, let $\pi(G) = 2\sqrt{G}$, δ =
0.05, ρ = 0.2, Q = 2, and $G_0$ = 16. Set u(t) = 0.8 for t ≥ 0, and
show that G(t) is constant for all t. Compute the value J of the
objective function.

1.3 Rich Rentier in Example 1.3 has initial wealth $W_0$ = $1,000,000.
Assume B = 0, ρ = 0.1, r = 0.15, and assume that Rich expects to
live for exactly 20 years.

(a) What is the maximum constant consumption level which Rich
can afford during his remaining life?

(b) If Rich’s utility function is U(C) = ln C, what is the present
value of the total utility in part (a)?

(c) Suppose Rich sets aside $100,000 to start the Rentier Foundation.
What is the maximum constant grant level which the foundation
can support if it is to last forever?

1.4 Suppose Rich in Exercise 1.3 takes on a part-time job, which yields
an income of y(t) at time t. Assume $y(t) = 10{,}000e^{-0.05t}$ and that
he has a bequest function B(W) = 0.5 ln W.

(a) Reformulate this new optimal control problem. 

(b) If Rich (no longer a rentier) consumes at the constant rate 
found in Exercise 1.3(a), find his terminal wealth and his new total 
utility. 

1.5 In Example 1.1, suppose there is a cost associated with changing
the rate of production. One way to formulate this problem is to let
the control variable u(t) denote the rate of change of the production
rate P(t), having a cost $cu^2$ associated with such changes, where
c > 0. Formulate the new problem.

[Hint: Let P(t) be an additional state variable.]

1.6 In Example 1.2, suppose G measures the number of people who
know about the product. Hence, if A is the total population, then
A − G is the number of people who do not know about the product.
If u(t) measures the advertising rate at time t, assume that u(A − G)
is the corresponding rate of increase of G due to this advertising.
Formulate the new model.

1.7 Consider the following educational policy question. Let S(t) denote
the total number of scientists at time t, and let δ be the retirement
rate of scientists. Let E(t) be the number of teaching scientists and
R(t) be the number of research scientists, so that S(t) = E(t) +
R(t). Assume γE(t) is the number of newly graduated scientists at
time t, of which the policy allocates uγE(t) to the pool of teachers,
where 0 ≤ u ≤ 1. The remaining graduates are added to the pool
of researchers. The government has a target of maximizing the
function αE(T) + βR(T) at a given future time T, where α and β
are positive constants. Formulate the optimal control problem for
the government.

1.8 Let $x \in E^n$ be a column vector with n ≥ 2, let g be an n-component
row vector function of x, and let f be an n-component column
vector function of x. Use the ordinary product rule of calculus to
derive the formula

$$ (gf)_x = g f_x + f^T g^T_x = g f_x + f^T g_x. $$

For the sake of completeness, we note that when x is a row vector,
then $(gf)_x = (f_x)^T g^T + (g_x)^T f = (g f_x + f^T g_x)^T$, a column vector.

1.9 Let $x \in E^n$ be a column vector with n ≥ 2 and $g = F_x$, where
$F : E^n \to E^1$ is twice continuously differentiable. Use Exercise 1.8
to show that

$$ (gf)_x = (F_x f)_x = F_x f_x + f^T F_{xx} = F_x f_x + (F_{xx} f)^T. $$

[Hint: F being twice continuously differentiable implies $F_{xx} = (F_{xx})^T$.]

1.10 Use the bang function defined in (1.21) to sketch the optimal control

u*(t) = bang[−1, 1; W(t)] for 0 ≤ t ≤ 5,

when

(a) W(t) = t − 2

(b) W(t) = $t^2 - 4t + 3$

(c) W(t) = sin πt.

1.11 Use the sat function defined in (1.22) to sketch the optimal control

u*(t) = sat[2, 3; W(t)] for 0 ≤ t ≤ 5,

when

(a) W(t) = 4 − t

(b) W(t) = $2 + t^2$

(c) W(t) = $4 - 4e^{-t}$.

1.12 Evaluate the function imp($G_1$, $G_2$, t) for the advertising model of
Table 1.2 when $G_2 > G_1$, Q = ∞, and π(G) = pG, where p is a
constant.




Chapter 2 

The Maximum Principle: 
Continuous Time 



The main purpose of this chapter is to introduce the maximum principle 
as a necessary condition that must be satisfied by any optimal control 
for the basic problem specified in Section 2.1. Although vector notation 
is used, the reader can think of the problem as having only a single state 
variable and a single control variable on the first reading. In Section 2.2, 
the method of dynamic programming is used to derive the maximum 
principle. We use this method because of the simplicity and familiarity 
of the dynamic programming concept. The derivation also yields
significant economic interpretations. (In Appendix C, the maximum principle
is also derived by using a more general method similar to that of
Pontryagin et al. (1962), but with certain simplifications.) In Section 2.4, the
maximum principle is shown to be sufficient for optimal control under
an appropriate concavity condition, which holds in many management
science applications.



2.1 Statement of the Problem 

Optimal control theory deals with the problem of optimization of dy- 
namic systems. The problem must be well posed before any solution 
can be attempted. This requires a clear mathematical description of the 
system to be optimized, the constraints imposed on the system, and the 
objective function to be maximized (or minimized).







2.1.1 The Mathematical Model 

An important part of any control problem is the process of modeling the 
dynamic system, physical, business, or otherwise, under consideration. 
The aim is to arrive at a mathematical description which is simple enough 
to deal with, and realistic enough to be able to predict the response of 
the system to any given input. The model of the system for our purpose 
is restricted to systems that can be characterized by a system of ordinary 
differential equations (or, ordinary difference equations in the discrete- 
time case treated in Chapter 8). Thus, given the initial state $x_0$ of the
system and control history u(t), t ∈ [0, T], of the process, the evolution
of the system may be described by the first-order differential equation,
known also as the state equation,

$$ \dot{x}(t) = f(x(t), u(t), t), \quad x(0) = x_0, \tag{2.1} $$

where the vector of state variables, $x(t) \in E^n$, the vector of control
variables, $u(t) \in E^m$, and $f : E^n \times E^m \times E^1 \to E^n$. Furthermore, the
function f is assumed to be continuously differentiable. Here we assume
x to be a column vector and f to be a column vector of functions. The
path x(t), t ∈ [0, T], is called a state trajectory and u(t), t ∈ [0, T],
is called a control trajectory or simply, a control. The terms, vector
of state variables, state vector, and state will be used interchangeably;
similarly for the terms, vector of control variables, control vector, and
control. As mentioned earlier, when no confusion arises, we will usually
suppress the time notation (t); thus, e.g., x(t) will be written simply
as x. Furthermore, whether x denotes the state at time t or the entire
state trajectory should be inferred from the context. A similar statement
holds for u.
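
To make the role of the state equation concrete, the following sketch computes a state trajectory from an initial state and a given control history. Everything specific in it is an assumption made for illustration: the helper simulate_state, the classical fourth-order Runge-Kutta scheme, and the scalar system x' = -0.5x + u with constant control u = 1 are hypothetical and do not come from the text.

```python
import numpy as np

def simulate_state(f, u, x0, T, steps=1000):
    """Integrate x' = f(x, u(t), t), x(0) = x0 on [0, T] with classical RK4."""
    h = T / steps
    t = 0.0
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        k1 = f(x, u(t), t)
        k2 = f(x + 0.5 * h * k1, u(t + 0.5 * h), t + 0.5 * h)
        k3 = f(x + 0.5 * h * k2, u(t + 0.5 * h), t + 0.5 * h)
        k4 = f(x + h * k3, u(t + h), t + h)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        traj.append(x.copy())
    return np.array(traj)

# Hypothetical scalar system x' = -0.5 x + u with constant control u(t) = 1
f = lambda x, u, t: -0.5 * x + u
u = lambda t: 1.0

traj = simulate_state(f, u, x0=[0.0], T=4.0)

# Closed-form solution of this linear equation, used as a cross-check
exact = 2.0 * (1.0 - np.exp(-0.5 * 4.0))
print(traj[-1, 0], exact)
```

Changing the function u changes the whole trajectory, which is the sense in which the control history determines the evolution of the system.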

2.1.2 Constraints 

In this chapter, we are concerned with problems which do not have state 
constraints of types (1.4) and (1.5). Such constraints are considered in 
Chapters 3 and 4, as indicated in Section 1.1. We do impose constraints 
of type (1.3) on the control variables. We define an admissible control 
to be a control trajectory u(t), t ∈ [0, T], which is piecewise continuous
and satisfies, in addition,

$$ u(t) \in \Omega(t) \subset E^m, \quad t \in [0, T]. \tag{2.2} $$

Usually the set Ω(t) is determined by physical or economic constraints
on the values of the control variables at time t.

2.1.3 The Objective Function

An objective function is a quantitative measure of the performance of 
the system over time. An optimal control is defined to be an admissible 
control which maximizes the objective function. In business or economic 
problems, a typical objective function gives some appropriate measure 
of quantities such as profits, sales, or negative of costs. Mathematically, 
we let 

$$ J = \int_0^T F(x(t), u(t), t)\, dt + S[x(T), T] \tag{2.3} $$

denote the objective function, where the functions $F : E^n \times E^m \times E^1 \to E^1$
and $S : E^n \times E^1 \to E^1$ are assumed for our purposes to be
continuously differentiable. In a typical business application, F(x, u, t) could
be the instantaneous profit rate and S[x, T] could be the salvage value of
having x as the system state at the terminal time T.



2.1.4 The Optimal Control Problem 

Given the preceding definitions we can state the optimal control problem 
with which we will be concerned in this chapter. The problem is to find 
an admissible control u*, which maximizes the objective function (2.3) 
subject to the state equation (2.1) and the control constraints (2.2). We 
now restate the optimal control problem as: 

$$ \begin{cases} \max_{u(t) \in \Omega(t)} \left\{ J = \int_0^T F(x, u, t)\, dt + S[x(T), T] \right\} \\ \text{subject to} \\ \dot{x} = f(x, u, t), \quad x(0) = x_0. \end{cases} \tag{2.4} $$

The control u* is called an optimal control and x*, determined by means
of the state equation with u = u*, is called the optimal trajectory or an
optimal path. The optimal value of the objective function will be denoted
as J(u*) or J*.

The optimal control problem (2.4) specified above is said to be in
Bolza form because of the form of the objective function in (2.3). It is
said to be in Lagrange form when S = 0. We say the problem is in Mayer
form when F = 0. Furthermore, it is in linear Mayer form when F = 0
and S is linear, i.e.,

$$ \begin{cases} \max_{u(t) \in \Omega(t)} \left\{ J = c\,x(T) \right\} \\ \text{subject to} \\ \dot{x} = f(x, u, t), \quad x(0) = x_0, \end{cases} \tag{2.5} $$

where $c = (c_1, c_2, \ldots, c_n)$ is an n-dimensional row vector of given
constants. In the next paragraph and in Exercise 2.3, it will be demonstrated
that all of these forms can be converted into the linear Mayer form.

To show that the Bolza form can be reduced to the linear Mayer
form, we define a new state vector $y = (y_1, y_2, \ldots, y_{n+1})^T$, having n + 1
components defined as follows: $y_i = x_i$ for i = 1, ..., n and

$$ \dot{y}_{n+1} = F(x, u, t) + \frac{\partial S(x, t)}{\partial x} f(x, u, t) + \frac{\partial S(x, t)}{\partial t}, \quad y_{n+1}(0) = S(x_0, 0). \tag{2.6} $$

We also put c = (0, ..., 0, 1), where c has n + 1 components, so that the
objective function is $J = c\,y(T) = y_{n+1}(T)$. If we now integrate (2.6)
from 0 to T, we have

$$ J = c\,y(T) = y_{n+1}(T) = \int_0^T F(x, u, t)\, dt + S[x(T), T], \tag{2.7} $$

which is the same as the objective function J in (2.4). Of course, the
price paid for going from Bolza to linear Mayer form is the addition of
one state variable and its associated differential equation (2.6).
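
The reduction from Bolza to linear Mayer form can be checked numerically for a fixed control. The sketch below rests on several assumptions that are not in the text: the scalar functions f, F, S, the control, and the RK4 integrator are hypothetical, and the extra state is started at S(x_0, 0) and driven by F plus the total time derivative of S along the trajectory, so that its terminal value equals the Bolza objective.

```python
import numpy as np

# Hypothetical scalar data (n = 1); none of these functions come from the text.
def f(x, u, t):        # right-hand side of the state equation
    return -x + u

def F(x, u, t):        # integrand of the objective function
    return x - u ** 2

def S(x, t):           # salvage value function
    return 2.0 * x * np.exp(-0.1 * t)

def S_total_derivative(x, u, t):
    # dS/dt along a trajectory: S_x * f + S_t
    S_x = 2.0 * np.exp(-0.1 * t)
    S_t = -0.2 * x * np.exp(-0.1 * t)
    return S_x * f(x, u, t) + S_t

def control(t):        # an arbitrary fixed control, not an optimal one
    return 0.5 + 0.2 * np.sin(t)

def rhs(t, w):
    x, y_extra, q = w
    u = control(t)
    return np.array([
        f(x, u, t),                                # original state x
        F(x, u, t) + S_total_derivative(x, u, t),  # extra Mayer state
        F(x, u, t),                                # running integral of F only
    ])

def rk4(rhs, w0, T, steps=2000):
    h = T / steps
    t, w = 0.0, np.asarray(w0, dtype=float)
    for _ in range(steps):
        k1 = rhs(t, w)
        k2 = rhs(t + 0.5 * h, w + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, w + 0.5 * h * k2)
        k4 = rhs(t + h, w + h * k3)
        w = w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return w

x0, T = 1.0, 2.0
x_T, y_T, q_T = rk4(rhs, [x0, S(x0, 0.0), 0.0], T)

J_bolza = q_T + S(x_T, T)   # integral of F plus salvage value
J_mayer = y_T               # c y(T) with c = (0, 1)
print(J_bolza, J_mayer)
```

The two numbers agree up to integration error for any admissible control, which is what allows the maximization to be carried out in either form.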

Exercise 2.3 poses the question of showing in a similar way that the 
Lagrange and Mayer forms can also be reduced to the linear Mayer form. 

In Section 2.2, we derive necessary conditions for optimal control in 
the form of the maximum principle, and in Section 2.4 we derive sufficient 
conditions. In any particular application, the existence of a solution will 
be demonstrated by actually finding a solution that satisfies both the 
necessary and the sufficient conditions for optimality. We thus avoid the 
necessity of having to prove general existence theorems, which require 
advanced and difficult mathematics. Nevertheless, interested readers can 
consult Hartl, Sethi, and Vickson (1995) for a brief discussion of existence 
results and references therein including Cesari (1983) for further details.

2.2 Dynamic Programming and the Maximum
Principle

We shall now derive the maximum principle by using a dynamic pro- 
gramming approach. The proof is intuitive in nature and is not intended 
to be mathematically rigorous. For more rigorous derivations, we refer 
the reader to Appendix C, Pontryagin et al. (1962), Berkovitz (1961), 
Halkin (1967), Boltyanskii (1971), Hartberger (1973), Bryant and Mayne 
(1974), Leitmann (1981), and Seierstad and Sydsaeter (1987). Additional 
references can be found in the survey by Hartl, Sethi, and Vickson (1995). 
For maximum principles for more general optimal control problems
including those with nondifferentiable functions, see Clarke (1996, 1983,
1989).

2.2.1 The Hamilton-Jacobi-Bellman Equation 

Suppose $V(x, t) : E^n \times E^1 \to E^1$ is a function whose value is the
maximum value of the objective function of the control problem for the
system, given that we start it at time t in state x. That is,

$$ V(x, t) = \max_{u(s) \in \Omega(s)} \left[ \int_t^T F(x(s), u(s), s)\, ds + S(x(T), T) \right], \tag{2.8} $$

where for s ≥ t,

$$ \frac{dx}{ds} = f(x(s), u(s), s), \quad x(t) = x. $$

We initially assume that the value function V(x, t) exists for all x and t
in the relevant ranges. Later we will make additional assumptions about
the function V(x, t).

Richard Bellman (1957) in his book on dynamic programming states
the principle of optimality as follows:

An optimal policy has the property that, whatever the
initial state and initial decision are, the remaining decisions
must constitute an optimal policy with regard to the outcome
resulting from the first decision.

Intuitively this principle is obvious, for if we were to start in state x
at time t and did not follow an optimal path from there on, then there
would exist (by assumption) a better path from t to T, hence we could
improve the proposed solution by following the better path from time t
on. We will use the principle of optimality to derive conditions on the
value function V(x, t).

Figure 2.1: An Optimal Path in the State-Time Space

Figure 2.1 is a schematic picture of the optimal path x*(t) in the
state-time space, and two nearby points (x, t) and (x + δx, t + δt), where
δt is a small increment of time and x + δx = x(t + δt). The value function
changes from V(x, t) to V(x + δx, t + δt) between these two points. By
the principle of optimality, the change in the objective function is made
up of two parts: first, the incremental change in J from t to t + δt, which
is given by the integral of F(x, u, t) from t to t + δt; second, the value
function V(x + δx, t + δt) at time t + δt. The control actions u(τ) should
be chosen to lie in Ω(τ), τ ∈ [t, t + δt], and to maximize the sum of these
two terms. In equation form this is

$$ V(x, t) = \max_{\substack{u(\tau) \in \Omega(\tau) \\ \tau \in [t, t + \delta t]}} \left\{ \int_t^{t + \delta t} F[x(\tau), u(\tau), \tau]\, d\tau + V[x(t + \delta t), t + \delta t] \right\}, \tag{2.9} $$

where δt represents a small increment in t. It is instructive to compare
this equation to definition (2.8).

Since F is a continuous function, the integral in (2.9) is approximately
F(x, u, t)δt so that we can rewrite (2.9) as

$$ V(x, t) = \max_{u \in \Omega(t)} \left\{ F(x, u, t)\delta t + V[x(t + \delta t), t + \delta t] \right\} + o(\delta t), \tag{2.10} $$

where o(δt) denotes a collection of higher-order terms in δt. (By the
definition given in Section 1.4, o(δt) is a function such that
$\lim_{\delta t \to 0} o(\delta t)/\delta t = 0$.)

We now make an assumption of which we will talk more later. We
assume that the value function V is a continuously differentiable function
of its arguments. This allows us to use the Taylor series expansion of V
with respect to δt and obtain

$$ V[x(t + \delta t), t + \delta t] = V(x, t) + [V_x(x, t)\dot{x} + V_t(x, t)]\delta t + o(\delta t), \tag{2.11} $$

where $V_x$ and $V_t$ are partial derivatives of V(x, t) with respect to x and
t, respectively.

Substituting for $\dot{x}$ from (2.1) in the above equation and then using it
in (2.10), we obtain

$$ V(x, t) = \max_{u \in \Omega(t)} \left\{ F(x, u, t)\delta t + V(x, t) + V_x(x, t) f(x, u, t)\delta t + V_t(x, t)\delta t \right\} + o(\delta t). \tag{2.12} $$

Canceling V(x, t) on both sides and then dividing by δt we get

$$ 0 = \max_{u \in \Omega(t)} \left\{ F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) \right\} + \frac{o(\delta t)}{\delta t}. \tag{2.13} $$

Now we let δt → 0 and obtain the following equation

$$ 0 = \max_{u \in \Omega(t)} \left\{ F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) \right\}, \tag{2.14} $$

with the boundary condition

$$ V(x, T) = S(x, T). \tag{2.15} $$

That this boundary condition must hold follows from the fact that at 
t = T, the value function is simply the salvage value function. 

Note that the components of the vector $V_x(x, t)$ can be interpreted
as the marginal contributions of the state variables x to the objective
function being maximized. We denote the marginal return vector (along
the optimal path x*(t)) by the adjoint (row) vector $\lambda(t) \in E^n$, i.e.,

$$ \lambda(t) = V_x(x^*(t), t) := V_x(x, t)\big|_{x = x^*(t)}. \tag{2.16} $$

From the preceding remark, we can also interpret λ(t) as the per unit
change in the objective function for a small change in x*(t) at time t;
see Section 2.2.4. Next we introduce the so-called Hamiltonian

$$ H[x, u, V_x, t] = F(x, u, t) + V_x(x, t) f(x, u, t) \tag{2.17} $$

or, simply,

$$ H(x, u, \lambda, t) = F(x, u, t) + \lambda f(x, u, t). \tag{2.18} $$

We can rewrite equation (2.14) as the following equation,

$$ 0 = \max_{u \in \Omega(t)} \left[ H(x, u, V_x, t) + V_t \right], \tag{2.19} $$

called the Hamilton-Jacobi-Bellman equation or, simply, the HJB
equation.

Note that it is possible to take Vt out of the maximizing operation 
since it does not depend on u. 
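
A small numerical check can make the HJB equation more tangible. The sketch below uses a hypothetical problem chosen for illustration, not one stated in this section: maximize the integral of -x over [0, 1] subject to x' = u with u in [-1, 1] and no salvage value. For this problem the control u = -1 is optimal from any starting point, which gives the candidate value function V(x, t) = -x(1 - t) + (1 - t)^2/2. The code verifies that the maximized expression F + V_x f + V_t is zero at a sample point; the helper name hjb_residual and the grid over u are assumptions of the sketch.

```python
import numpy as np

T = 1.0

def V(x, t):
    """Candidate value function for the hypothetical problem described above."""
    return -x * (T - t) + 0.5 * (T - t) ** 2

def hjb_residual(x, t, h=1e-5):
    """Return max over u of F + V_x f + V_t, and the maximizing u on a grid."""
    V_x = (V(x + h, t) - V(x - h, t)) / (2 * h)   # central difference in x
    V_t = (V(x, t + h) - V(x, t - h)) / (2 * h)   # central difference in t
    controls = np.linspace(-1.0, 1.0, 201)        # grid over the control set [-1, 1]
    values = -x + V_x * controls + V_t            # F = -x, f = u
    k = values.argmax()
    return values[k], controls[k]

residual, u_star = hjb_residual(0.7, 0.3)
print(residual, u_star)

# Boundary condition V(x, T) = S(x, T) = 0 for this problem
print(V(0.7, T))
```

Because V_x is negative for t < 1 in this example, the maximizing control sits at the lower bound of the control set, in line with the bang function of (1.21).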

The Hamiltonian maximizing condition of the maximum principle
can be obtained from (2.19) and (2.16) by observing that, if x*(t) and
u*(t) are optimal values of the state and control variables and λ(t) is the
corresponding value of the adjoint variable at time t, then the optimal
control u*(t) must satisfy (2.19), i.e., for all u ∈ Ω(t),

$$ H[x^*(t), u^*(t), \lambda(t), t] + V_t(x^*(t), t) \ge H[x^*(t), u, \lambda(t), t] + V_t(x^*(t), t). \tag{2.20} $$

Canceling the term $V_t$ on both sides, we obtain

$$ H[x^*(t), u^*(t), \lambda(t), t] \ge H[x^*(t), u, \lambda(t), t] \tag{2.21} $$

for all u ∈ Ω(t).

In order to complete the statement of the maximum principle, we
must still obtain the adjoint equation.

2.2.2 Derivation of the Adjoint Equation

The derivation of the adjoint equation proceeds from the HJB equation 
(2.19), and is similar to those in Fel’dbaum (1965) and Kirk (1970). 
Note that, given the optimal path x*, the optimal control u* maximizes 
the right-hand side of (2.19), and its maximum value is zero. We now 
consider small perturbations of the values of the state variables in a 
neighborhood of the optimal path x*. Thus, let 

$$ x(t) = x^*(t) + \delta x(t), \tag{2.22} $$

where δx(t), ‖δx(t)‖ < ε for a small positive ε, be such a perturbation.

We now consider a ‘fixed’ time instant t. We can then write (2.19) 
as 



$$ H[x^*(t), u^*(t), V_x(x^*(t), t), t] + V_t(x^*(t), t) \ge H[x(t), u^*(t), V_x(x(t), t), t] + V_t(x(t), t). \tag{2.23} $$

To explain, we note from (2.19) that the left-hand side of (2.23) equals
zero. The right-hand side can attain the value zero only if u*(t) is also an
optimal control for x(t). In general for x(t) ≠ x*(t), this will not be so.

From this observation, it follows that the expression on the right-hand
side of (2.23) attains its maximum (of zero) at x(t) = x*(t). Furthermore,
x(t) is not explicitly constrained. In other words, x*(t) is an
unconstrained local maximum of the right-hand side of (2.23), so that
the derivative of this expression with respect to x must vanish at x*(t),
i.e.,

$$ H_x[x^*(t), u^*(t), V_x(x^*(t), t), t] + V_{tx}(x^*(t), t) = 0. \tag{2.24} $$

In order to take the derivative as we did in (2.24), we must further assume
that V is a twice continuously differentiable function of its arguments.
Using the definition of the Hamiltonian in (2.17), the identity (1.15), and
the fact that $V_{xx} = (V_{xx})^T$, we obtain

$$ F_x + V_x f_x + f^T V_{xx} + V_{tx} = F_x + V_x f_x + (V_{xx} f)^T + V_{tx} = 0, \tag{2.25} $$

where the superscript T denotes the transpose operation. See (1.16) or
Exercise 1.9 for further explanation.

The derivation of the necessary condition (2.25) is the crux of the
reasoning in the derivation of the adjoint equation. It is easy to obtain
the so-called adjoint equation from it. We begin by taking the time
derivative of $V_x(x, t)$. Thus,

$$ \begin{aligned} \frac{dV_x}{dt} &= \left( \frac{dV_{x_1}}{dt}, \frac{dV_{x_2}}{dt}, \ldots, \frac{dV_{x_n}}{dt} \right) \\ &= \left( V_{x_1 x}\dot{x} + V_{x_1 t}, \; V_{x_2 x}\dot{x} + V_{x_2 t}, \; \ldots, \; V_{x_n x}\dot{x} + V_{x_n t} \right) \\ &= \left( \sum_{i=1}^{n} V_{x_1 x_i}\dot{x}_i, \; \sum_{i=1}^{n} V_{x_2 x_i}\dot{x}_i, \; \ldots, \; \sum_{i=1}^{n} V_{x_n x_i}\dot{x}_i \right) + (V_x)_t \\ &= (V_{xx}\dot{x})^T + V_{xt} \\ &= (V_{xx} f)^T + V_{tx}. \end{aligned} \tag{2.26} $$

Since the terms on the right-hand side of (2.26) are the same as the last
two terms in (2.25), we see that (2.26) becomes

$$ \frac{dV_x}{dt} = -F_x - V_x f_x. \tag{2.27} $$

Because λ was defined in (2.16) to be $V_x$, we can rewrite (2.27) as

$$ \dot{\lambda} = -F_x - \lambda f_x. $$

To see that the right-hand side of this equation can be written simply as
$-H_x$, we need to go back to the definition of H in (2.18) and recognize
that when taking the partial derivative of H with respect to x, the adjoint
variables λ are considered to be independent of x. We note further that
along the optimal path, λ is a function of t only. Thus,

$$ \dot{\lambda} = -H_x. \tag{2.28} $$

Also, from the definition of λ in (2.16) and the boundary condition
(2.15), we have the terminal boundary condition, which is also called
the transversality condition:

$$ \lambda(T) = \frac{\partial S(x, T)}{\partial x}\bigg|_{x = x^*(T)} = S_x[x^*(T), T]. \tag{2.29} $$

The adjoint equation (2.28) together with its boundary condition (2.29)
determine the adjoint variables.

From the definition of the Hamiltonian in (2.18), it is also obvious
that the state equation can be written as

$$\dot{x} = f = H_\lambda, \tag{2.30}$$

using an argument similar to the one used in the derivation of (2.28).
The system of equations (2.28) and (2.30) along with their respective
boundary conditions can be collected together and expressed as the system

$$
\begin{cases}
\dot{x} = H_\lambda, & x(0) = x_0, \\
\dot{\lambda} = -H_x, & \lambda(T) = S_x[x(T),T],
\end{cases}
\tag{2.31}
$$

called a canonical system of equations or canonical adjoints. Hence, the
name adjoint vector for $\lambda$.

This completes our derivation of the maximum principle using dynamic
programming. We can now summarize the main results in the
following section.
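
As an informal illustration of the canonical system (2.31), the following Python sketch evaluates the two right-hand sides, $H_\lambda$ and $-H_x$, by central differences for a scalar state. The function name `canonical_rhs`, the step size `h`, and the test Hamiltonian $H = -x + \lambda u$ are assumptions made for this sketch; they are not part of the derivation above.

```python
# Sketch: evaluate the right-hand sides of the canonical system (2.31),
# x' = H_lambda and lambda' = -H_x, for a scalar state by central differences.
# `canonical_rhs` and the step size `h` are illustrative names, not from the text.

def canonical_rhs(H, x, u, lam, t, h=1e-6):
    """Return (x_dot, lam_dot) for a scalar Hamiltonian H(x, u, lam, t)."""
    # H_lambda reproduces the state dynamics f(x, u, t).
    x_dot = (H(x, u, lam + h, t) - H(x, u, lam - h, t)) / (2.0 * h)
    # -H_x gives the adjoint dynamics.
    lam_dot = -(H(x + h, u, lam, t) - H(x - h, u, lam, t)) / (2.0 * h)
    return x_dot, lam_dot


def example_hamiltonian(x, u, lam, t):
    # Hypothetical test case: F = -x and f = u, so H = -x + lam * u.
    return -x + lam * u


x_dot, lam_dot = canonical_rhs(example_hamiltonian, x=1.0, u=-1.0, lam=-0.5, t=0.0)
```

For this test Hamiltonian the sketch should return values close to $\dot{x} = u$ and $\dot{\lambda} = 1$, which is what differentiating $H$ by hand gives.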



2.2.3 The Maximum Principle 

The necessary conditions for $u^*$ to be an optimal control are:

$$
\begin{cases}
\dot{x}^* = f(x^*, u^*, t), \quad x^*(0) = x_0, \\
\dot{\lambda} = -H_x[x^*, u^*, \lambda, t], \quad \lambda(T) = S_x[x^*(T),T], \\
H[x^*(t), u^*(t), \lambda(t), t] \ge H[x^*(t), u, \lambda(t), t],
\end{cases}
\tag{2.32}
$$

for all $u \in \Omega(t)$, $t \in [0,T]$.

It should be emphasized that the state and the adjoint arguments
of the Hamiltonian are $x^*(t)$ and $\lambda(t)$, respectively, on both sides of the
Hamiltonian maximizing condition in (2.32). Furthermore, $u^*(t)$ must
provide a global maximum of the Hamiltonian $H[x^*(t), u, \lambda(t), t]$ over
$u \in \Omega(t)$. For this reason the necessary conditions in (2.32) are called
the maximum principle.

Note that in order to apply the maximum principle, we must simultaneously
solve two sets of differential equations with $u^*$ obtained from the
Hamiltonian maximizing condition in (2.32). With the control variable
$u^*$ so obtained, the state equation for $x^*$ is given with the initial value
$x_0$, and the adjoint equation for $\lambda$ is specified with a condition on the
terminal value $\lambda(T)$. Such a system of equations, where initial values of
some variables and final values of other variables are specified, is called
a two-point boundary value problem (TPBVP). The general solution of
such problems can be very difficult; see Bryson and Ho (1969), Roberts
and Shipman (1972), and Feichtinger and Hartl (1986). However, there
are certain special cases which are easy. One such is the case in which the
adjoint equation is independent of the state and the control variables;
here we can solve the adjoint equation first, then get the optimal control
$u^*$, and then solve for $x^*$.

In subsequent chapters we will solve many two-point boundary value 
problems of varying degrees of difficulty. 

Note also that if we can solve the Hamiltonian maximizing condition
for an optimal control function in closed form as

$$u^*(t) = u[x^*(t), \lambda(t), t],$$

then we can substitute this into the state and adjoint equations to get
a two-point boundary value problem just in terms of a set of differential
equations, i.e.,

$$
\begin{cases}
\dot{x}^* = f(x^*, u(x^*, \lambda, t), t), & x^*(0) = x_0, \\
\dot{\lambda} = -H_x(x^*, u(x^*, \lambda, t), \lambda, t), & \lambda(T) = S_x[x^*(T),T].
\end{cases}
$$

In Exercise 2.17, you are asked to derive such a two-point boundary value
problem.

One final remark should be made. Because an integral is unaffected
by values of the integrand at a finite set of points, some of the arguments
made in this chapter may not hold at a finite set of points. This does 
not affect the validity of the results. 

In the next section, we give economic interpretations of the maximum 
principle, and in Section 2.3, we solve five simple examples by using the 
maximum principle. 

2.2.4 Economic Interpretations of the Maximum 
Principle 

Recall from Section 2.1.3 that the objective function (2.3) is

$$J = \int_0^T F(x,u,t)\,dt + S[x(T),T],$$

where $F$ is considered to be the instantaneous profit rate measured in
dollars per unit of time, and $S[x,T]$ is the salvage value, in dollars, of
the system at time $T$ when the terminal state is $x$. For purposes of
discussion it will be convenient to consider the system as a firm and the
state $x(t)$ as the stock of capital at time $t$.

In (2.16), we interpreted $\lambda(t)$ to be the per unit change in the value
function $V(x,t)$ for small changes in capital stock $x$. In other words,
$\lambda(t)$ is the marginal value per unit of capital at time $t$, and it is also
referred to as the price or shadow price of a unit of capital at time $t$.
In particular, the value of $\lambda(0)$ is the marginal rate of change of the
maximum value of $J$ (the objective function) with respect to the change
in the initial capital stock, $x_0$.

The interpretation of the Hamiltonian function in (2.18) can now be
obtained. Multiplying (2.18) formally by $dt$ and using the state equation
(2.1) gives

$$H\,dt = F\,dt + \lambda f\,dt = F\,dt + \lambda\dot{x}\,dt = F\,dt + \lambda\,dx.$$

The first term $F(x,u,t)\,dt$ represents the direct contribution to $J$ in dollars
from time $t$ to $t + dt$, if the firm is in state $x$ (i.e., it has a capital stock
of $x$), and we apply control $u$ in the interval $[t, t + dt]$. The differential
$dx = f(x,u,t)\,dt$ represents the change in capital stock from time $t$ to
$t + dt$, when the firm is in state $x$ and control $u$ is applied. Therefore, the
second term $\lambda\,dx$ represents the value in dollars of the incremental capital
stock, $dx$, and hence can be considered as the indirect contribution
to $J$ in dollars. Thus, $H\,dt$ can be interpreted as the total contribution
to $J$ from time $t$ to $t + dt$ when $x(t) = x$ and $u(t) = u$ in the interval
$[t, t + dt]$.

With this interpretation of the Hamiltonian, it is easy to see why the
Hamiltonian must be maximized at each instant of time $t$. If we were just
to maximize $F$ at each instant $t$, we would not be maximizing $J$, because
we would ignore the effect of control in changing the capital stock, which
gives rise to indirect contributions to $J$. The maximum principle derives
the adjoint variable $\lambda(t)$, the price of capital at time $t$, in such a way that
$\lambda(t)\,dx$ is the correct valuation of the indirect contribution to $J$ from
time $t$ to $t + dt$. As a consequence, the Hamiltonian maximizing problem
can be treated as a static problem at each instant $t$. In other words, the
maximum principle decouples the dynamic maximization problem (2.4)
in the interval $[0,T]$ into a set of static maximization problems associated
with instants $t$ in $[0,T]$. Thus, the Hamiltonian can be interpreted as a
surrogate profit rate to be maximized at each instant of time $t$.

The value of $\lambda$ to be used in the maximum principle is given by (2.28)
and (2.29), i.e.,

$$\dot{\lambda} = -\frac{\partial H}{\partial x} = -\frac{\partial F}{\partial x} - \lambda\frac{\partial f}{\partial x}, \quad \lambda(T) = S_x[x(T),T].$$

Rewriting the first equation as

$$-d\lambda = H_x\,dt = F_x\,dt + \lambda f_x\,dt,$$

we can observe that along the optimal path, the decrease $-d\lambda$ in the price
of capital from $t$ to $t + dt$, which can be considered as the marginal cost
of holding that capital, equals the marginal revenue $H_x\,dt$ of investing
the capital. In turn, the marginal revenue, $H_x\,dt$, consists of the sum of the
direct marginal contribution, $F_x\,dt$, and the indirect marginal contribution,
$\lambda f_x\,dt$. Thus, the adjoint equation becomes the equilibrium relation:
marginal cost equals marginal revenue, which is a familiar concept
in the economics literature. See, e.g., Cohen and Cyert (1965, p. 189) or
Takayama (1974, p. 712).

Further insight can be obtained by integrating the above adjoint 
equation from t to T as follows: 

$$
\begin{aligned}
\lambda(t) &= \lambda(T) + \int_t^T H_x(x(\tau), u(\tau), \lambda(\tau), \tau)\,d\tau \\
&= S_x[x(T),T] + \int_t^T H_x\,d\tau.
\end{aligned}
$$

Note that the price $\lambda(T)$ of a unit of capital at time $T$ is its marginal
salvage value, $S_x[x(T),T]$. The price $\lambda(t)$ of a unit of capital at time $t$
is the sum of its terminal price, $\lambda(T)$, and the integral of the marginal
surrogate profit rate, $H_x$, from $t$ to $T$.

The above interpretations show that the adjoint variables behave in
much the same way as do the dual variables in linear (and nonlinear)
programming, the differences being that here the adjoint variables are
time dependent and satisfy derived differential equations. These
connections will become clearer in Chapter 8, which addresses the discrete
maximum principle.



2.3 Elementary Examples 

In order to absorb the maximum principle, the reader should study very
carefully the examples in this section, all of which are problems having
only one state and one control variable. Some or all of the exercises at 
the end of the chapter should also be worked. 

In the following examples and others in this book, we shall omit the 
superscript * on the optimal value of the state variable when there is no 
confusion arising in doing so. 
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Example 2.1 Consider the problem: 



$$\max\left\{J = \int_0^1 -x\,dt\right\} \tag{2.33}$$

subject to the state equation

$$\dot{x} = u, \quad x(0) = 1 \tag{2.34}$$

and the control constraint

$$u \in \Omega = [-1, 1]. \tag{2.35}$$

Note that $T = 1$, $F = -x$, $S = 0$, and $f = u$. Because $F = -x$, we can
interpret the problem as one of minimizing the (signed) area under the
curve $x(t)$ for $0 \le t \le 1$.



Solution. First, we form the Hamiltonian

$$H = -x + \lambda u \tag{2.36}$$

and note that, because the Hamiltonian is linear in $u$, the form of the
optimal control, i.e., the one that would maximize the Hamiltonian, is

$$
u^*(t) =
\begin{cases}
1 & \text{if } \lambda(t) > 0, \\
\text{undefined} & \text{if } \lambda(t) = 0, \\
-1 & \text{if } \lambda(t) < 0,
\end{cases}
\tag{2.37}
$$

or referring to the notation in Section 1.4,

$$u^*(t) = \text{bang}[-1, 1; \lambda(t)]. \tag{2.38}$$

To find $\lambda$, we write the adjoint equation

$$\dot{\lambda} = -H_x = 1, \quad \lambda(1) = S_x[x(T),T] = 0. \tag{2.39}$$

Because this equation does not involve $x$ and $u$, we can easily solve it as

$$\lambda(t) = t - 1. \tag{2.40}$$

It follows that $\lambda(t) = t - 1 \le 0$ for all $t \in [0,1]$, and since we can set
$u^*(1) = -1$, which defines $u$ at the single point $t = 1$, we have the
optimal control

$$u^*(t) = -1 \text{ for } t \in [0,1].$$

Substituting this into the state equation (2.34) we have

$$\dot{x} = -1, \quad x(0) = 1, \tag{2.41}$$

whose solution is

$$x(t) = 1 - t \text{ for } t \in [0,1]. \tag{2.42}$$

The graphs of the optimal state and adjoint trajectories appear in Figure
2.2. Note that the optimal value of the objective function is $J^* = -1/2$.
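
The result of Example 2.1 can be checked numerically. The sketch below simulates the state equation (2.34) with forward Euler, accumulates the objective (2.33), and estimates the derivative of the optimal objective value with respect to the initial state, which the shadow-price interpretation of Section 2.2.4 suggests should match $\lambda(0) = -1$ from (2.40). The helper name `objective`, the step count, the comparison control $u = 0$, and the perturbation size are assumptions of this sketch.

```python
# Sketch: simulate Example 2.1 with forward Euler and estimate the shadow price.
# Function names, the step count, and the perturbation size are assumptions.

def objective(u_of_t, x0=1.0, T=1.0, steps=10_000):
    """Approximate J = integral of -x dt subject to x' = u, x(0) = x0."""
    dt = T / steps
    x, J = x0, 0.0
    for k in range(steps):
        u = u_of_t(k * dt)
        x_next = x + u * dt
        # Trapezoidal accumulation of the payoff -x over one step.
        J += -0.5 * (x + x_next) * dt
        x = x_next
    return J


def optimal(t):
    return -1.0   # bang[-1, 1; lambda(t)] with lambda(t) = t - 1 <= 0


def idle(t):
    return 0.0    # a feasible but non-optimal comparison control


J_star = objective(optimal)
J_idle = objective(idle)

# Central-difference estimate of dJ*/dx0, to compare with lambda(0) = -1.
eps = 1e-3
shadow_price = (objective(optimal, x0=1.0 + eps)
                - objective(optimal, x0=1.0 - eps)) / (2.0 * eps)
```

Because the control $u^* = -1$ does not depend on $x_0$ in this example, perturbing $x_0$ while holding the control fixed is a legitimate way to estimate the derivative of the optimal value here; in general the control would have to be re-optimized.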

[Figure 2.2: optimal state and adjoint trajectories for Example 2.1]




Example 2.2 Let us solve the same problem as in Example 2.1 over the 
interval [0, 2] so that the objective is to 



$$\text{maximize}\left\{J = \int_0^2 -x\,dt\right\}. \tag{2.43}$$

The dynamics and constraints are (2.34) and (2.35), respectively, as before.
Here we want to minimize the signed area between the horizontal
axis and the trajectory of $x(t)$ for $0 \le t \le 2$.

Solution. As before the Hamiltonian is defined by (2.36) and the optimal
control is as in (2.38). The adjoint equation

$$\dot{\lambda} = 1, \quad \lambda(2) = 0 \tag{2.44}$$

is the same as (2.39) except that now $T = 2$ instead of $T = 1$. The
solution of (2.44) is easily found to be

$$\lambda(t) = t - 2, \quad t \in [0,2]. \tag{2.45}$$

The graph of $\lambda(t)$ is shown in Figure 2.3.

Figure 2.3: Optimal State and Adjoint Trajectories for Example 2.2

With $\lambda(t)$ as in (2.45), we can determine $u^*(t) = -1$ throughout.
Thus, the state equation is the same as (2.41). Its solution is given by
(2.42) for $t \in [0,2]$. The optimal value of the objective function is $J^* = 0$.
The graph of $x(t)$ is also sketched in Figure 2.3.

Example 2.3 The next example is: 

$$\max\left\{J = \int_0^1 -\frac{1}{2}x^2\,dt\right\} \tag{2.46}$$

subject to the same constraints as in Example 2.1, namely,

$$\dot{x} = u, \quad x(0) = 1, \quad u \in \Omega = [-1, 1]. \tag{2.47}$$

Here $F = -(1/2)x^2$ so that the interpretation of the objective function
(2.46) is that we are trying to find the trajectory $x(t)$ in order that the
area under the curve $(1/2)x^2$ is minimized.



Solution. The Hamiltonian is

$$H = -\frac{1}{2}x^2 + \lambda u, \tag{2.48}$$

which is linear in $u$ so that the optimal policy is

$$u^*(t) = \text{bang}[-1, 1; \lambda]. \tag{2.49}$$

The adjoint equation is

$$\dot{\lambda} = -H_x = x, \quad \lambda(1) = 0. \tag{2.50}$$

Here the adjoint equation involves $x$ so that we cannot solve it directly.
Because the state equation (2.47) involves $u$, which depends on $\lambda$, we
also cannot integrate it independently without knowing $\lambda$.

The way out of this dilemma is to use some intuition. Since we want
to minimize the area under $(1/2)x^2$ and since $x(0) = 1$, it is clear that we
want $x$ to decrease as quickly as possible. Let us therefore temporarily
assume that $\lambda$ is nonpositive in the interval $[0,1]$ so that from (2.49) we
have $u = -1$ throughout the interval. (In Exercise 2.5, you will be asked
to show that this assumption is correct.) With this assumption, we can
solve (2.47) as

$$x(t) = 1 - t. \tag{2.51}$$

Substituting this into (2.50) gives

$$\dot{\lambda} = 1 - t.$$

Integrating both sides of this equation from $t$ to 1 gives

$$\int_t^1 \dot{\lambda}(\tau)\,d\tau = \int_t^1 (1 - \tau)\,d\tau,$$

or

$$\lambda(1) - \lambda(t) = \left.\left(\tau - \frac{1}{2}\tau^2\right)\right|_t^1,$$

which, using $\lambda(1) = 0$, yields

$$\lambda(t) = -\frac{1}{2}t^2 + t - \frac{1}{2}. \tag{2.52}$$

The reader may now verify that $\lambda(t)$ is nonpositive in the interval $[0,1]$,
verifying our original assumption. Hence, (2.51) and (2.52) satisfy the
necessary conditions. In Exercise 2.6, you will be asked to show that
they satisfy sufficient conditions derived in Section 2.4 as well, so that
they are indeed optimal. Figure 2.4 shows the graphs of the optimal
trajectories.
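
The verification suggested above can also be sketched numerically. The following Python fragment evaluates (2.51) and (2.52) on a grid, checks that $\lambda(t)$ stays nonpositive and satisfies the adjoint equation (2.50), and approximates the objective (2.46). The grid size, the difference step, and the helper names are assumptions of this sketch; it illustrates the check and is not a substitute for the analytical argument.

```python
# Sketch: check the candidate solution of Example 2.3 on a grid.
# Grid size and helper names are illustrative assumptions.

def x_path(t):
    return 1.0 - t                       # state trajectory (2.51)


def lam_path(t):
    return -0.5 * t * t + t - 0.5        # adjoint trajectory (2.52)


n = 1000
grid = [k / n for k in range(n + 1)]

# lambda should be nonpositive on [0, 1] and vanish at t = 1.
max_lam = max(lam_path(t) for t in grid)

# The adjoint equation (2.50) requires lambda' = x; compare a central
# difference of lambda with x at every grid point.
h = 1e-5
max_residual = max(
    abs((lam_path(t + h) - lam_path(t - h)) / (2.0 * h) - x_path(t)) for t in grid
)

# Objective value via the midpoint rule: J = integral of -(1/2) x^2 dt on [0, 1].
J = sum(-0.5 * x_path((k + 0.5) / n) ** 2 / n for k in range(n))
```

Note that (2.52) can be written as $\lambda(t) = -\frac{1}{2}(1 - t)^2$, which makes the nonpositivity evident, and that the objective integrates to $-1/6$.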






Figure 2.4: Optimal Trajectories for Examples 2.3 and 2.4 



Example 2.4 Let us rework Example 2.3 with $T = 2$, i.e., with the
objective function:

$$\max\left\{J = \int_0^2 -\frac{1}{2}x^2\,dt\right\} \tag{2.53}$$

subject to the constraints (2.47).

Solution. The Hamiltonian is still as in (2.48) and the form of the
optimal policy remains as in (2.49). The adjoint equation is

$$\dot{\lambda} = x, \quad \lambda(2) = 0,$$

which is the same as (2.50) except $T = 2$ instead of $T = 1$. Let us
try to extend the solution of the previous example from $T = 1$ to $T = 2$.
Note from (2.52) that $\lambda(1) = 0$. Recall from the definition
of the bang function that bang$[-1, 1; 0]$ is not defined, which allows us to
choose $u$ in (2.49) arbitrarily when $\lambda = 0$. This is an instance of singular
control, so let us see if we can maintain the singular control by choosing
$u$ appropriately. To do this we choose $u = 0$ when $\lambda = 0$. Since $\lambda(1) = 0$
we set $u(1) = 0$ so that from (2.47), we have $\dot{x}(1) = 0$. Now note that
if we set $u(t) = 0$ for $t > 1$, then by integrating equations (2.47) and
(2.50) forward from $t = 1$ to $t = 2$, we see that $x(t) = 0$ and $\lambda(t) = 0$
for $1 < t \le 2$; in other words, $u(t) = 0$ maintains singular control in the
interval. Intuitively, this is the correct answer since once we get $x = 0$,
we should keep it at 0 in order to maximize the objective function $J$ in
(2.53). We will later give further discussion of singular control and will
state an additional necessary condition in Appendix D.3 for such cases;
see also Bell and Jacobson (1975). In Figure 2.4, we can get the singular
solution by extending the graphs shown to the right (as shown by the thick
dotted line), making $x(t) = 0$ and $u^*(t) = 0$ for $1 < t \le 2$.
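
To see informally why holding $x$ at zero is preferable, the sketch below compares the singular policy of Example 2.4 with the feasible alternative of keeping $u = -1$ over the whole horizon. The simulation helper, the step count, and the comparison policy are assumptions of this sketch; the comparison illustrates the intuition above and does not prove optimality.

```python
# Sketch: compare two feasible controls for Example 2.4 (T = 2).
# The simulation helper and step count are assumptions for illustration.

def simulate(u_of_t, x0=1.0, T=2.0, steps=20_000):
    """Forward-Euler state path for x' = u; returns J = integral of -(1/2) x^2 dt."""
    dt = T / steps
    x, J = x0, 0.0
    for k in range(steps):
        u = u_of_t(k * dt)
        x_next = x + u * dt
        # Trapezoidal rule on the running payoff F = -(1/2) x^2.
        J += -0.25 * (x * x + x_next * x_next) * dt
        x = x_next
    return J


def singular(t):
    # u = -1 until x reaches 0 at t = 1, then the singular control u = 0.
    return -1.0 if t < 1.0 else 0.0


def always_down(t):
    # Keeps pushing x below zero after t = 1.
    return -1.0


J_singular = simulate(singular)
J_always_down = simulate(always_down)
```

Analytically, the singular policy gives $J = -1/6$, whereas $u = -1$ throughout gives $J = -1/3$, since the state then moves away from zero again after $t = 1$.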

Example 2.5 Our last example is slightly more complicated and the
optimal control is not bang-bang. The problem is:

$$\max\left\{J = \int_0^2 (2x - 3u - u^2)\,dt\right\} \tag{2.54}$$

subject to

$$\dot{x} = x + u, \quad x(0) = 5 \tag{2.55}$$

and the control constraint

$$u \in \Omega = [0, 2]. \tag{2.56}$$

Solution. Here $T = 2$, $F = 2x - 3u - u^2$, $S = 0$, and $f = x + u$. The
Hamiltonian is

$$
\begin{aligned}
H &= (2x - 3u - u^2) + \lambda(x + u) \\
&= (2 + \lambda)x - (u^2 + 3u - \lambda u).
\end{aligned}
\tag{2.57}
$$

Let us find the optimal control policy by differentiating (2.57) with
respect to $u$. Thus,

$$\frac{\partial H}{\partial u} = -2u - 3 + \lambda = 0,$$

so that the form of the optimal control is

$$u(t) = \frac{\lambda(t) - 3}{2}, \tag{2.58}$$

provided this expression stays within the interval $\Omega = [0, 2]$. Note that
the second derivative of $H$ with respect to $u$ is $\partial^2 H/\partial u^2 = -2 < 0$, so
that (2.58) satisfies the second-order condition for the maximum of a
function.

We next derive the adjoint equation as

$$\dot{\lambda} = -\frac{\partial H}{\partial x} = -2 - \lambda, \quad \lambda(2) = 0, \tag{2.59}$$

which can be rewritten as

$$\dot{\lambda} + \lambda = -2, \quad \lambda(2) = 0.$$

This equation can be solved by the techniques explained in Appendix A.
Its solution is

$$\lambda(t) = 2(e^{2-t} - 1).$$

If we substitute this into (2.58) and impose the control constraint (2.56),
we see that the optimal control is

$$
u^*(t) =
\begin{cases}
2 & \text{if } e^{2-t} - 2.5 > 2, \\
e^{2-t} - 2.5 & \text{if } 0 \le e^{2-t} - 2.5 \le 2, \\
0 & \text{if } e^{2-t} - 2.5 < 0,
\end{cases}
\tag{2.60}
$$

or referring to the notation defined in (1.22),

$$u^*(t) = \text{sat}[0, 2; e^{2-t} - 2.5].$$

The graph of $u^*(t)$ appears in Figure 2.5. In the figure, $t_1$ is the solution
of $e^{2-t} - 2.5 = 2$, i.e., $t_1 \approx 0.496$, while $t_2$ solves $e^{2-t} - 2.5 = 0$, which
gives $t_2 \approx 1.08$.

In Exercise 2.4 you will be asked to compute the optimal state trajectory
$x^*(t)$ corresponding to $u^*(t)$ shown in Figure 2.5 by piecing together
the solutions of three separate differential equations obtained from (2.55)
and (2.60).
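
The control law of Example 2.5 is easy to evaluate numerically. The sketch below codes the adjoint solution, the saturated control (2.60), and the two switching times. The function names `sat`, `lam`, and `u_star` and the sample times are choices made for this sketch; `sat` is meant to mirror the notation of (1.22).

```python
import math

# Sketch: evaluate the optimal control of Example 2.5 and its switching times.
# `sat` mirrors the notation in (1.22); the function names are assumptions.

def sat(lower, upper, value):
    """Clamp value to the interval [lower, upper]."""
    return max(lower, min(upper, value))


def lam(t):
    return 2.0 * (math.exp(2.0 - t) - 1.0)       # solution of (2.59)


def u_star(t):
    return sat(0.0, 2.0, (lam(t) - 3.0) / 2.0)   # (2.58) with the constraint (2.56)


# Switching times: exp(2 - t) - 2.5 equals 2 at t1 and 0 at t2.
t1 = 2.0 - math.log(4.5)
t2 = 2.0 - math.log(2.5)

# Control values at a few sample times across the three regimes.
samples = {t: u_star(t) for t in (0.0, t1, 0.75, t2, 1.5, 2.0)}
```

The closed forms $t_1 = 2 - \ln 4.5$ and $t_2 = 2 - \ln 2.5$ follow directly from the two switching conditions and agree with the rounded values quoted above.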

[Figure 2.5: graph of the optimal control $u^*(t)$ for Example 2.5]




2.4 Sufficiency Conditions 

So far, we have shown the necessity of the maximum principle conditions
for optimality. We next prove a theorem that gives qualifications
under which the maximum principle conditions are also sufficient for
optimality. This theorem is important from our point of view since the
models derived from many management science applications will satisfy
conditions required for the sufficiency result. As remarked earlier, our
technique for proving existence will be to display for any given model, a
solution that satisfies both necessary and sufficient conditions. A good
reference for sufficiency conditions is Seierstad and Sydsaeter (1987).

We first define a function $H^0 : E^n \times E^n \times E^1 \to E^1$ called the derived
Hamiltonian as follows:

$$H^0(x, \lambda, t) = \max_{u \in \Omega(t)} H(x, u, \lambda, t). \tag{2.61}$$

We assume that by this equation a function $u = u^0(x, \lambda, t)$ is implicitly
and uniquely defined. Given these assumptions we have by definition,

$$H^0(x, \lambda, t) = H(x, u^0, \lambda, t). \tag{2.62}$$

It is also possible to show that

$$H^0_x(x, \lambda, t) = H_x(x, u^0, \lambda, t). \tag{2.63}$$

To see this for the case of differentiable $u^0$, let us differentiate (2.62) with
respect to $x$:

$$H^0_x(x, \lambda, t) = H_x(x, u^0, \lambda, t) + H_u(x, u^0, \lambda, t)\frac{\partial u^0}{\partial x}. \tag{2.64}$$

Let us look at the second term on the right-hand side of (2.64). We
must show

$$H_u(x, u^0, \lambda, t)\frac{\partial u^0}{\partial x} = 0 \tag{2.65}$$

for all $x$. There are two cases to consider: (i) The unconstrained global
maximum of $H$ occurs in the interior of $\Omega(t)$. Here $H_u(x, u^0, \lambda, t) = 0$. (ii)
The unconstrained global maximum of $H$ occurs outside of $\Omega(t)$. Here
$\partial u^0/\partial x = 0$, because changing $x$ does not influence the optimal value of
$u$. Thus (2.65) and, therefore, (2.63) hold. Exercise 2.15 gives a specific
instance of this case.
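
The identity (2.63) can be spot-checked numerically. The sketch below assumes the Hamiltonian of Exercise 2.15 has the form $H = \lambda u x - u^2/2$ with $\Omega(t) = [0, 1]$ (the form consistent with the control sat$[0, 1; \lambda x]$ stated in that exercise), and compares a central-difference estimate of $H^0_x$ with $H_x$ evaluated at $u^0$. The function names, test points, and step size are assumptions of this sketch.

```python
# Sketch: numerically check the envelope identity (2.63), H0_x = H_x at u = u0,
# assuming the Hamiltonian of Exercise 2.15 is H = lam*u*x - u**2/2 on [0, 1].

def H(x, u, lam):
    return lam * u * x - 0.5 * u * u


def u0(x, lam):
    # Maximizer of H over u in [0, 1]: the stationary point lam*x clamped to [0, 1].
    return max(0.0, min(1.0, lam * x))


def H0(x, lam):
    return H(x, u0(x, lam), lam)          # derived Hamiltonian, as in (2.62)


def H_x(x, u, lam):
    return lam * u                        # partial derivative with u held fixed


def envelope_gap(x, lam, h=1e-6):
    """Difference between a central-difference H0_x and H_x evaluated at u0."""
    H0_x = (H0(x + h, lam) - H0(x - h, lam)) / (2.0 * h)
    return abs(H0_x - H_x(x, u0(x, lam), lam))


# One point in each regime: interior maximum, upper bound active, lower bound active.
gaps = [envelope_gap(0.5, 1.0), envelope_gap(3.0, 1.0), envelope_gap(-2.0, 1.0)]
```

The three test points correspond to case (i) and to the two ways in which case (ii) can occur, with the maximizer pinned at the upper or the lower bound of $\Omega(t)$.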

Remark 2.1. We have shown the result in (2.63) for cases where $u^0$ is
a differentiable function of $x$. It holds more generally provided $\Omega(t)$ is
appropriately qualified; see Derzko, Sethi, and Thompson (1984).

Theorem 2.1 (Sufficiency Conditions). Let $u^*(t)$, and the corresponding
$x^*(t)$ and $\lambda(t)$ satisfy the maximum principle necessary condition
(2.32) for all $t \in [0,T]$. Then, $u^*$ is an optimal control if $H^0(x, \lambda(t), t)$
is concave in $x$ for each $t$ and $S(x,T)$ is concave in $x$.

Proof. The proof is a minor extension of the arguments in Arrow and
Kurz (1970) and Mangasarian (1966). By definition

$$H[x(t), u(t), \lambda(t), t] \le H^0[x(t), \lambda(t), t]. \tag{2.66}$$

Since $H^0$ is differentiable and concave, we can use the applicable definition
of concavity given in Section 1.4 to obtain

$$H^0[x(t), \lambda(t), t] \le H^0[x^*(t), \lambda(t), t] + H^0_x[x^*(t), \lambda(t), t][x(t) - x^*(t)]. \tag{2.67}$$

Using (2.66), (2.62), and (2.63) in (2.67), we obtain

$$H[x(t), u(t), \lambda(t), t] \le H[x^*(t), u^*(t), \lambda(t), t] + H_x[x^*(t), u^*(t), \lambda(t), t][x(t) - x^*(t)]. \tag{2.68}$$

By definition of $H$ in (2.18) and the adjoint equation of (2.32)

$$F[x(t), u(t), t] + \lambda(t) f[x(t), u(t), t] \le F[x^*(t), u^*(t), t] + \lambda(t) f[x^*(t), u^*(t), t] - \dot{\lambda}(t)[x(t) - x^*(t)]. \tag{2.69}$$

Using the state equation in (2.32), transposing, and regrouping,

$$F[x^*(t), u^*(t), t] - F[x(t), u(t), t] \ge \dot{\lambda}(t)[x(t) - x^*(t)] + \lambda(t)[\dot{x}(t) - \dot{x}^*(t)]. \tag{2.70}$$

Furthermore, since $S[x,T]$ is a concave function in its first argument, we
have

$$S[x(T),T] \le S[x^*(T),T] + S_x[x^*(T),T][x(T) - x^*(T)] \tag{2.71}$$

or,

$$S[x^*(T),T] - S[x(T),T] + S_x[x^*(T),T][x(T) - x^*(T)] \ge 0. \tag{2.72}$$

Integrating both sides of (2.70) from 0 to $T$ and adding (2.72), we have

$$J(u^*) - J(u) + S_x[x^*(T),T][x(T) - x^*(T)] \ge \lambda(T)[x(T) - x^*(T)] - \lambda(0)[x(0) - x^*(0)], \tag{2.73}$$

where $J(u)$ is the value of the objective function associated with a control
$u$. Since $x^*(0) = x(0) = x_0$, the initial condition, and since $\lambda(T) =
S_x[x^*(T),T]$ from the terminal adjoint condition in (2.32), we have

$$J(u^*) \ge J(u). \tag{2.74}$$

Thus, $u^*$ is an optimal control. This completes the proof. □

Note that if the problem is given in the Lagrange form, i.e., $S(x,T) \equiv 0$,
then all we need is the concavity of $H^0(x, \lambda(t), t)$ in $x$ for each $t$. Since
$\lambda(t)$ is not known a priori, it is usual to test for a stronger assumption,
i.e., to check for the concavity of the function $H^0(x, \lambda, t)$ in $x$ for any
$\lambda$ and $t$. Sometimes the stronger condition given in Exercise 2.19 can be
used.

Example 2.6 Let us show that the problems in Examples 2.1 and 2.2
satisfy the sufficient conditions. We have from (2.36) and (2.61),

$$H^0 = -x + \lambda u^0,$$

where $u^0$ is given by (2.37). Since $u^0$ is a function of $\lambda$ only, $H^0(x, \lambda, t)$ is
certainly concave in $x$ for any $t$ and $\lambda$ (and in particular for $\lambda(t)$ supplied
by the maximum principle). Since $S(x,T) = 0$, the sufficient conditions
hold.

Finally, it is important to mention that thus far in this chapter, we
have considered problems in which the terminal values of the state variables
are not constrained. Such problems are called free-end-point problems.
The problems at the other extreme, where the terminal values of
the state variables are completely specified, are termed fixed-end-point
problems. Then, there are problems in between these two extremes.
While a detailed discussion of terminal conditions on state variables appears
in Section 3.4 of the next chapter, it is instructive here to briefly
indicate how the maximum principle needs to be modified in the case
of fixed-end-point problems. Suppose $x(T)$ is completely specified, i.e.,
$x(T) = a \in E^n$, where $a$ is a vector of constants. Observe then that
the first term on the right-hand side of inequality (2.73) vanishes regardless
of the value of $\lambda(T)$, since $x(T) - x^*(T) = a - a = 0$ in this case.
This means that the sufficiency result would go through for any value of
$\lambda(T)$. Not surprisingly, therefore, the transversality condition (2.29) in
the fixed-end-point case changes to

$$\lambda(T) = K, \tag{2.75}$$

where $K \in E^n$ is a vector of constants to be determined. The maximum
principle for fixed-end-point problems can be restated as (2.32) with
$x(T) = a$ and without $\lambda(T) = S_x[x^*(T),T]$. The resulting two-point
boundary value problem has initial and final values on the state variables,
whereas both initial and terminal values for the adjoint variables are
unspecified, i.e., $\lambda(0)$ and $\lambda(T)$ are constants to be determined.

In Exercises 2.9 and 2.18, you are asked to solve the fixed-end-point 
problems given there. 

2.5 Solving a TPBVP by Using Spreadsheet 
Software 

A number of examples and exercises found in the rest of this book involve
finding a numerical solution to a two-point boundary value problem
(TPBVP). In this section we shall show how the GOAL SEEK function
in the EXCEL spreadsheet software can be used for this purpose. We
will solve the following example.

Example 2.7 Consider the problem:

$$\max\left\{J = -\int_0^1 \frac{1}{2}(x^2 + u^2)\,dt\right\}$$

subject to

$$\dot{x} = -x^3 + u, \quad x(0) = 5. \tag{2.76}$$

Solution. We form the Hamiltonian

$$H = -\frac{1}{2}(x^2 + u^2) + \lambda(-x^3 + u),$$

where the adjoint variable $\lambda$ satisfies the equation

$$\dot{\lambda} = x + 3x^2\lambda, \quad \lambda(1) = 0. \tag{2.77}$$

Since $u$ is unconstrained, we set $H_u = 0$ to obtain $u^* = \lambda$. With this,
the state equation (2.76) becomes

$$\dot{x} = -x^3 + \lambda, \quad x(0) = 5. \tag{2.78}$$

Thus, the TPBVP is given by the system of equations (2.77) and (2.78).

In order to solve these equations we discretize them by replacing
$dx/dt$ and $d\lambda/dt$ by

$$\frac{\Delta x}{\Delta t} = \frac{x(t + \Delta t) - x(t)}{\Delta t} \quad\text{and}\quad \frac{\Delta\lambda}{\Delta t} = \frac{\lambda(t + \Delta t) - \lambda(t)}{\Delta t},$$

respectively. Substitution of $\Delta x/\Delta t$ for $\dot{x}$ in (2.78) and $\Delta\lambda/\Delta t$ for $\dot{\lambda}$ in
(2.77) gives the discrete version of the TPBVP:

$$x(t + \Delta t) = x(t) + [-x(t)^3 + \lambda(t)]\Delta t, \quad x(0) = 5, \tag{2.79}$$

$$\lambda(t + \Delta t) = \lambda(t) + [x(t) + 3x(t)^2\lambda(t)]\Delta t, \quad \lambda(1) = 0. \tag{2.80}$$

In order to solve these equations, open an empty spreadsheet, choose
the unit of time to be $\Delta t = 0.01$, make a guess for the initial value $\lambda(0)$
to be, say $-0.2$, and make the entries in the cells of the spreadsheet as
specified below:

Enter -0.2 in cell A1.

Enter 5 in cell B1.

Enter = A1 + (B1 + 3 * (B1^2) * A1) * 0.01 in cell A2.

Enter = B1 + (-B1^3 + A1) * 0.01 in cell B2.

Note that $\lambda(0) = -0.2$ shown as the entry -0.2 in cell A1 is merely a
guess. The correct value will be determined by the use of the GOAL
SEEK function.

Next select cells A2 and B2 and drag the combination down
to row 101 of the spreadsheet. Using EDIT in the menu bar, select
FILL DOWN. Thus, EXCEL will solve equations (2.79) and (2.80)
numerically from $t = 0$ to $t = 1$ in steps of $\Delta t = 0.01$, and that solution
will appear as entries in columns A and B of the spreadsheet. In other
words, the guessed solution for $\lambda(t)$ will appear in cells A1 to A101 and
the guessed solution for $x(t)$ will appear in cells B1 to B101. In order
to find the correct value for $\lambda(0)$, use the GOAL SEEK function under
TOOLS in the menu bar and make the following entries:

Set cell: A101.

To value: 0.

By changing cell: A1.



It finds the correct initial value for the adjoint variable as $\lambda(0) =
-0.10437$, which should appear in cell A1, and the correct ending value
of the state variable as $x(1) = 0.62395$, which should appear in cell B101.
You will notice that the entry in cell A101 may not be exactly zero as
instructed, although it will be very close to it. In our example, it is
$-0.0007$. By using the CHART function, the graphs of $x(t)$ and $\lambda(t)$ can
be plotted.



EXERCISES FOR CHAPTER 2 

2.1 (a) In Example 2.1, show $J^* = -1/2$.

(b) In Example 2.2, show $J^* = 0$.

(c) In Example 2.3, show $J^* = -1/6$.

(d) In Example 2.4, show $J^* = -1/6$.

2.2 Rework Example 2.5 with F = 2x — 3u. 

2.3 Show that both the Lagrange and Mayer forms of the optimal 
control problem can be reduced to the linear Mayer form (2.5). 

2.4 Complete Example 2.5 by writing the optimal $x^*(t)$ in the form of
integrals over the three intervals $(0, t_1)$, $(t_1, t_2)$, and $(t_2, 2)$ shown
in Figure 2.5.

[Hint: It is not necessary to actually carry out the numerical evaluation
of these integrals unless you are ambitious.]

2.5 In Example 2.3, show that $\lambda(t)$ cannot be positive over any finite
interval in $[0,1]$.

2.6 Show that the derived Hamiltonian $H^0$ found in Examples 2.4
and 2.5 satisfies the concavity condition required for the sufficiency
result in Section 2.4.

2.7 Show that the optimal control obtained from the application of the
maximum principle satisfies the principle of optimality: if $u^*(t)$ is
an optimal control and $x^*(t)$ is the corresponding optimal path for
$0 \le t \le T$ with $x(0) = x_0$, then verify the above proposition by
showing that $u^*(t)$ for $\tau \le t \le T$ satisfies the maximum principle
for the problem beginning at time $\tau$ with the initial condition
$x(\tau) = x^*(\tau)$.

2.8 Use the maximum principle to solve the following problem given in 
the Mayer form: 

$$\max\{8x_1(18) + 4x_2(18)\}$$

subject to

$$\dot{x}_1 = x_1 + x_2 + u, \quad x_1(0) = 15,$$

$$\dot{x}_2 = 2x_1 - u, \quad x_2(0) = 20,$$
and the control constraint 



$$0 \le u \le 1.$$



[Hint: Use the method in Appendix A to solve the simultaneous 
differential equations.] 



2.9 A simple controlled dynamical system is modeled by the scalar
equation

$$\dot{x} = x + u.$$

The fixed-end-point optimal control problem consists in steering
$x(t)$ from an initial state $x(0) = x_0$ to the target $x(1) = 0$, such


that 



1 



u^dt 



is minimized. Use the maximum principle to show that the optimal
control is given by



o 
2.10 Regional Allocation of Investment. Let $K_i$, $i = 1, 2$, denote the
capital stock in Region $i$. Let $b_i$ be the productivity of capital
and $s_i$ be the marginal propensity to save in Region $i$. Since the
investment funds for the two regions come from the savings in the
whole economy, we have

$$\dot{K}_1 + \dot{K}_2 = b_1 s_1 K_1 + b_2 s_2 K_2 = g_1 K_1 + g_2 K_2,$$

where $g_i = b_i s_i$. Let $u$ denote the control variable representing the
fraction of investment allocated to Region 1 with the remainder
going to Region 2. Clearly,

$$0 \le u \le 1, \tag{2.81}$$

and

$$\dot{K}_1 = u(g_1 K_1 + g_2 K_2), \quad K_1(0) = a_1 > 0, \tag{2.82}$$

$$\dot{K}_2 = (1 - u)(g_1 K_1 + g_2 K_2), \quad K_2(0) = a_2 > 0. \tag{2.83}$$

The optimal control problem is to maximize the productivity of
the whole economy at time $T$. Thus, the objective is to

$$\text{maximize}\,\{J = b_1 K_1(T) + b_2 K_2(T)\}$$

subject to (2.81), (2.82), and (2.83).

(a) Use the maximum principle to derive the form of the optimal
policy.

(b) Assume $b_2 > b_1$. Show that $u^*(t) = 0$ for $t \in [\hat{t}, T]$, where $\hat{t}$ is
a switching point and $0 \le \hat{t} < T$.

(c) If you are ambitious, find the $\hat{t}$ of part (b).

2.11* The system defined in (2.4) is termed autonomous if $F$, $f$, and $\Omega$
are not explicit functions of time $t$. In this case, show that the
Hamiltonian is constant along the optimal path, i.e., show that

$$\frac{dH}{dt} = 0.$$


2.12 A water reservoir (Figure 2.7) being used for the purpose of firefighting
is leaking, and its water height $x(t)$ is governed by

$$\dot{x} = -0.1x + u, \quad x(0) = 10,$$

Figure 2.7: Water Reservoir of Exercise 2.12

where $u(t)$ denotes the net inflow at time $t$ and $0 \le u \le 3$.

Note that $x(t)$ also represents the water pressure in appropriate
units. Since high water pressure is useful for firefighting, the objective
function in (a) below involves keeping the average pressure
high, while that in (b) involves building up a high pressure at
$T = 100$.

(a) Find the optimal control which maximizes

$$J = \int_0^{100} x\,dt.$$

Find the maximum level reached.

(b) Replace the objective function in (a) by

$$J = 5x(100),$$

and re-solve the problem.

(c) Redo the problem with $J = \int_0^{100} (x - 5u)\,dt$.

2.13 A Machine Maintenance Problem. Consider the machine state dynamics

$$\dot{x} = -\delta x + u, \quad x(0) = x_0 > 0,$$

where $\delta > 0$ is the rate of deterioration of the machine state and $u$
is the rate of machine maintenance. Find the optimal maintenance
rate so as to

$$\text{maximize}\left\{J = \int_0^T e^{-\rho t}\left(\pi x - \frac{u^2}{2}\right)dt + e^{-\rho T} S x(T)\right\},$$

where $\pi > 0$ with $\pi x$ representing the profit rate when the machine
state is $x$, $u^2/2$ is the cost of maintaining the machine at rate $u$,
$\rho > 0$ is the discount rate, $T$ is the time horizon, and $S > 0$ is
the salvage value of the machine for each unit of the machine state
at time $T$. Furthermore, show that the optimal maintenance rate
decreases, increases, or remains constant over time depending on
whether the difference $S - \pi/(\rho + \delta)$ is negative, positive, or zero,
respectively.

2.14 (a) Solve the optimal consumption problem of Example 1.3 with
$U(C) = \ln C$ and $B = 0$.

[Hint: Since $C(t) > 0$, we can replace the state constraint $W(t) \ge 0$
by the terminal condition $W(T) = 0$, and then use the transversality
condition of Exercise 2.13.]

(b) Find the rate of change of optimal consumption over time and
conclude that consumption remains constant when $r = \rho$, increases
when $r > \rho$, and decreases when $r < \rho$.

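2.15 Suppose $H(x, u, \lambda, t) = \lambda u x - \frac{1}{2}u^2$ and $\Omega(t) = [0, 1]$ for all $t$.

(a) Show that the optimal control $u^*$ is given by

$$
u^*(x) = \text{sat}[0, 1; \lambda x] =
\begin{cases}
\lambda x & \text{if } 0 \le \lambda x \le 1, \\
1 & \text{if } \lambda x > 1, \\
0 & \text{if } \lambda x < 0.
\end{cases}
$$

(b) Verify that (2.63) holds for all values of $x$ and $\lambda$.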

2.16* Provide an alternative derivation of the adjoint equation in Section
2.2.2 by starting with a restatement of the equation (2.18) as $-V_t = H^0$
and differentiating it with respect to $x$.

2.17 (a) State the two-point boundary value problem (TPBVP) whose
solution will give the optimal control $u^*(t)$ and trajectory $x^*(t)$
for the problem in Example 2.7, but with a new initial condition
$x(0) = 1$.

(b) Solve the TPBVP by using a spreadsheet software such as 
EXCEL. 




2.18 Consider the following fixed-end-point problem: 




subject to 

$$\dot{x} = f(x) + b(x)u, \quad x(0) = x_0, \quad x(T) = 0,$$

where functions $g > 0$, $f$, and $b$ are assumed to be continuously 
differentiable. Derive the two-point boundary value problem satisfied 
by the optimal state and control trajectories. 

2.19 If $F$ and $f$ are concave in $x$ and $u$ and if $\lambda(t) \ge 0$, then show that the 
derived Hamiltonian $H^0$ is concave in $x$. Note that the concavity 
of $F$ and $f$ is easier to check than the concavity of $H^0$ as required 
in Theorem 2.1 on sufficiency conditions. 




Chapter 3 

The Maximum Principle: 
Mixed Inequality 
Constraints 



The problems to which the maximum principle derived in the previous 
chapter was applicable had constraints involving only the control 
variables. We shall see that in many applied models it is necessary to impose 
constraints involving both control and state variables. Inequality 
constraints involving control and possibly state variables are called mixed 
inequality constraints. 

In the solution spaces of problems with mixed constraints, there may 
be regions in which one or more of the constraints is tight. When this 
happens, the system must be controlled in such a way that the tight 
constraints are not violated. As a result, the maximum principle of 
Chapter 2 must be revised so that the Hamiltonian is maximized subject 
to the constraints. This is done by appending the Hamiltonian with the 
mixed constraints and the associated Lagrange multipliers to form a 
Lagrangian, and then setting the derivatives of the resulting Lagrangian 
with respect to the control variables to zero. 

In Section 3.1, a Lagrangian form of the maximum principle is dis- 
cussed for models in which there are some constraints which involve only 
control variables, and others which involve both state and control vari- 
ables simultaneously. Problems having pure state variable constraints, 
i.e., those involving state variables but no control variables, will be dealt 
with in Chapter 4. 

In Section 3.2, we state conditions under which the Lagrangian 
maximum principle is also sufficient for optimality. 

Economists frequently analyze control models having an infinite hori- 
zon together with a continuous discount rate. By combining the discount 
factor with the adjoint variables and the Lagrange multipliers and mak- 
ing suitable changes in the definitions of the Hamiltonian and Lagrangian 
functions, it is possible to derive the current-value formulation of the 
maximum principle as described in Section 3.3. 

Terminal conditions for each of the above models are discussed in 
Section 3.4, and the models with infinite horizons and their stationary 
equilibrium solutions are covered in Section 3.5. 

Section 3.6 presents a classification of a number of the most important 
and commonly used kinds of optimal control models, together with a brief 
description of the forms of their optimal solutions. The reader may wish 
to refer to this section from time to time while working through later 
chapters in the book. 



3.1 A Maximum Principle for Problems with 
Mixed Inequality Constraints 

We will state the maximum principle for optimal control problems with 
mixed inequality constraints without proving it and without being rig- 
orous. For further details see Hestenes (1966), Arrow and Kurz (1970), 
Hadley and Kemp (1971), Bensoussan, Hurst, and Naslund (1974), Fe- 
ichtinger and Hartl (1986), and Seierstad and Sydsaeter (1987). For a 
review of the literature, see Hartl, Sethi, and Vickson (1995). 

Let the system under consideration be described by the following 
vector differential equation 

$$\dot{x} = f(x, u, t), \quad x(0) = x_0 \tag{3.1}$$

given the initial conditions $x_0$ and a control trajectory $u(t)$, $t \in [0,T]$. 
Note that in the above equation, $x(t) \in E^n$ and $u(t) \in E^m$, and the 
function $f : E^n \times E^m \times E^1 \to E^n$ is assumed to be continuously 
differentiable. 

Let us consider the following objective: 

$$\max \left\{ J = \int_0^T F(x,u,t)\,dt + S[x(T),T] \right\}, \tag{3.2}$$

where $F : E^n \times E^m \times E^1 \to E^1$ and $S : E^n \times E^1 \to E^1$ are continuously 
differentiable functions and where $T$ denotes the terminal time. 

Next we impose constraints on state and control variables. Specifically, 
for each $t \in [0,T]$, $x(t)$ and $u(t)$ must satisfy 

$$g(x, u, t) \ge 0, \quad t \in [0, T], \tag{3.3}$$

where $g : E^n \times E^m \times E^1 \to E^q$ is continuously differentiable in all its 
arguments and must contain terms in $u$. Inequality constraints without 
terms in $u$ will be introduced later in Chapter 4. 

It is important to note that the mixed constraints (3.3) allow for 
inequality constraints of the type $g(u, t) \ge 0$ as special cases. Thus, 
the control constraints of the form $u(t) \in \Omega(t)$ treated in Chapter 2 
can be subsumed in (3.3), provided they can be expressed in terms of a 
finite number of inequality constraints of the form $g(u, t) \ge 0$. In most 
problems that are of interest to us, this will indeed be the case. Thus, 
from here on, we shall formulate control constraints either directly as 
inequality constraints and include them as parts of (3.3), or as 
$u(t) \in \Omega(t)$, which can be easily converted into a set of inequality 
constraints to be included as parts of (3.3). 

Finally, the terminal state is constrained by the following inequality 
and equality constraints: 

$$a(x(T),T) \ge 0, \tag{3.4}$$

$$b(x(T),T) = 0, \tag{3.5}$$

where $a : E^n \times E^1 \to E^{l_a}$ and $b : E^n \times E^1 \to E^{l_b}$ are continuously 
differentiable in all their arguments. 

We can now define a control $u(t)$, $t \in [0, T]$, or simply $u$, to be 
admissible if it is piecewise continuous and if, together with the state 
trajectory $x(t)$, $t \in [0, T]$, that it generates, it satisfies the 
constraints (3.3), (3.4), and (3.5). 

Before proceeding further, we note that an interesting case of the 
terminal inequality constraints is the constraint of the type 

$$x(T) \in Y \subset X, \tag{3.6}$$

where $Y$ is a convex set and $X$ is the set of all feasible terminal states, 
also called the reachable set from the initial state $x_0$, i.e., 

$$X = \{x(T) \mid x(T) \text{ obtained by an admissible control } u \text{ and (3.1)}\}.$$

Note that the constraint (3.6) does not depend explicitly on $T$. Note also 
that the feasible set defined by (3.4) and (3.5) need not be convex. Thus, 
if the convex set $Y$ can be expressed by a finite number of inequalities 
$a(x(T)) \ge 0$ and equalities $b(x(T)) = 0$, then (3.6) becomes a special 
case of (3.4) and (3.5). In general, (3.6) is not a special case of (3.4) and 
(3.5), since it may not be possible to define a given $Y$ by a finite number 
of inequalities and equalities. 

In this book, we shall only deal with problems in which the following 
full-rank conditions hold. That is, 

$$\operatorname{rank}\,[\partial g/\partial u, \ \operatorname{diag}(g)] = q$$

holds for all arguments $x(t)$, $u(t)$, $t$ that could arise along an optimal 
solution, and 

$$\operatorname{rank} \begin{bmatrix} \partial a/\partial x & \operatorname{diag}(a) \\ \partial b/\partial x & 0 \end{bmatrix} = l_a + l_b$$

holds for all possible values of $x(T)$ and $T$. These conditions are also 
referred to as the constraint qualification. The first of these conditions 
means that the gradients with respect to $u$ of all active constraints 
in (3.3) must be linearly independent. Similarly, the second condition 
means that the gradients with respect to $x$ of the equality constraints 
(3.5) and of the active inequality constraints in (3.4) must be linearly 
independent. 

To state the maximum principle we define the Hamiltonian function 
$H : E^n \times E^m \times E^n \times E^1 \to E^1$ as 

$$H[x, u, \lambda, t] := F(x, u, t) + \lambda f(x, u, t), \tag{3.7}$$

where $\lambda \in E^n$ (a row vector). We also define the Lagrangian function 
$L : E^n \times E^m \times E^n \times E^q \times E^1 \to E^1$ as 

$$L[x, u, \lambda, \mu, t] := H(x, u, \lambda, t) + \mu g(x, u, t), \tag{3.8}$$

where $\mu \in E^q$ is a row vector, whose components are called Lagrange 
multipliers. These Lagrange multipliers satisfy the complementary 
slackness conditions 

$$\mu \ge 0, \quad \mu g(x,u,t) = 0. \tag{3.9}$$

The adjoint vector satisfies the differential equation 

$$\dot{\lambda} = -L_x[x, u, \lambda, \mu, t] \tag{3.10}$$

with the boundary conditions 

$$\lambda(T) = S_x(x(T),T) + \alpha a_x(x(T),T) + \beta b_x(x(T),T),$$

$$\alpha \ge 0, \quad \alpha a(x(T), T) = 0,$$

where $\alpha \in E^{l_a}$ and $\beta \in E^{l_b}$ are constant vectors. 

The maximum principle states that the necessary conditions for $u^*$, 
with the corresponding state trajectory $x^*$, to be an optimal control are 
that there should exist continuous and piecewise continuously 
differentiable functions $\lambda$, piecewise continuous functions $\mu$, and constants 
$\alpha$ and $\beta$ such that (3.11) holds, i.e., 

$$
\left.
\begin{array}{l}
\dot{x}^* = f(x^*, u^*, t), \quad x^*(0) = x_0, \\
\text{satisfying the terminal constraints} \\
a(x^*(T),T) \ge 0 \ \text{and} \ b(x^*(T),T) = 0, \\
\dot{\lambda} = -L_x[x^*, u^*, \lambda, \mu, t] \\
\text{with the transversality conditions} \\
\lambda(T) = S_x(x^*(T),T) + \alpha a_x(x^*(T),T) + \beta b_x(x^*(T),T), \\
\alpha \ge 0, \quad \alpha a(x^*(T),T) = 0, \\
\text{the Hamiltonian maximizing condition} \\
H[x^*(t),u^*(t),\lambda(t),t] \ge H[x^*(t),u,\lambda(t),t] \\
\text{at each } t \in [0,T] \text{ for all } u \text{ satisfying} \\
g[x^*(t),u,t] \ge 0, \\
\text{and the Lagrange multipliers } \mu(t) \text{ are such that} \\
\left.\dfrac{\partial L}{\partial u}\right|_{u=u^*(t)} := \left.\left(\dfrac{\partial H}{\partial u} + \mu \dfrac{\partial g}{\partial u}\right)\right|_{u=u^*(t)} = 0 \\
\text{and the complementary slackness conditions} \\
\mu(t) \ge 0, \quad \mu(t) g(x^*, u^*, t) = 0 \ \text{hold.}
\end{array}
\right\} \tag{3.11}
$$

In the case of the terminal constraint (3.6), note that the terminal 
conditions on the state and the adjoint variables in (3.11) will be 
replaced, respectively, by 

$$x^*(T) \in Y \subset X \tag{3.12}$$

and 

$$[\lambda(T) - S_x(x^*(T),T)][y - x^*(T)] \ge 0, \quad \forall y \in Y. \tag{3.13}$$

In Exercise 3.31, you are asked to derive (3.13) from (3.11) in the one 
dimensional case when $Y = [\underline{x}, \bar{x}]$, where $\underline{x}$ and $\bar{x}$ are two constants such 
that $\bar{x} > \underline{x}$. 

Furthermore, if the terminal time $T$ in the problem (3.1)-(3.5) is 
unspecified, there is an additional necessary transversality condition for 
$T^*$ to be optimal (see Exercise 3.5), namely, 

$$H[x^*(T^*),u^*(T^*),\lambda(T^*),T^*] + S_T[x^*(T^*),T^*] = 0, \tag{3.14}$$

provided $T^*$ is an interior solution, i.e., $T^* \in (0, \infty)$. The use of this 
condition will be illustrated in Example 3.5 solved later in the chapter. 
Moreover, if $T$ is restricted to lie in the interval $[T_1, T_2]$, where 
$T_2 > T_1 \ge 0$, then (3.14) is still valid provided $T^* \in (T_1, T_2)$. If $T^* = T_1$, then 
the equality (3.14) is replaced by $\le$, and if $T^* = T_2$, then the equality 
is replaced by $\ge$; see Hestenes (1966). You will find these observations 
useful in Exercises 3.33 and 3.34 due to Seierstad and Sydsæter (1987). 

Remark 3.1 Strictly speaking, we should have $H = \lambda_0 F + \lambda f$ in (3.7) 
with $\lambda_0 \ge 0$. However, we can set $\lambda_0 = 1$ in most applications that are 
of interest to us; see Hartl, Sethi, and Vickson (1995) for details. 

Remark 3.2 It should be pointed out that if the set $Y$ in (3.6) consists 
of a single point $Y = \{k\}$, making the problem a fixed-end-point 
problem, then the transversality condition reduces to simply $\lambda(T)$ equals a 
constant to be determined, since $x^*(T) = k$. In this case the salvage 
function $S$ becomes a constant, and can therefore be disregarded. When 
$Y = X$, the terminal condition in (3.11) reduces to (2.28). Further 
discussion of the terminal conditions is given in Section 3.4 along with a 
summary in Table 3.1. 

Example 3.1 Consider the problem: 

$$\max \left\{ J = \int_0^1 u\,dt \right\}$$

subject to 

$$\dot{x} = u, \quad x(0) = 1, \tag{3.15}$$

$$u \ge 0, \quad x - u \ge 0. \tag{3.16}$$

Note that constraints (3.16) are of the mixed type (3.3). They can also 
be rewritten as $0 \le u \le x$. 

Solution. The Hamiltonian is 

$$H = u + \lambda u = (1 + \lambda)u,$$

so that the optimal control has the form 

$$u^* = \operatorname{bang}[0, x; 1 + \lambda]. \tag{3.17}$$

To get the adjoint equation and the multipliers associated with 
constraints (3.16), we form the Lagrangian: 

$$L = H + \mu_1 u + \mu_2(x - u) = \mu_2 x + (1 + \lambda + \mu_1 - \mu_2)u.$$

From this we get the adjoint equation 

$$\dot{\lambda} = -\frac{\partial L}{\partial x} = -\mu_2, \quad \lambda(1) = 0. \tag{3.18}$$

Also note that the optimal control must satisfy 

$$\frac{\partial L}{\partial u} = 1 + \lambda + \mu_1 - \mu_2 = 0, \tag{3.19}$$

and $\mu_1$ and $\mu_2$ must satisfy the complementary slackness conditions 

$$\mu_1 \ge 0, \quad \mu_1 u = 0, \tag{3.20}$$

$$\mu_2 \ge 0, \quad \mu_2(x - u) = 0. \tag{3.21}$$

It is obvious for this simple problem that $u^*(t) = x(t)$ should be the 
optimal control for all $t \in [0, 1]$. We now show that this control satisfies 
all the conditions of the Lagrangian form of the maximum principle. 

Since $x(0) = 1$, the control $u^* = x$ gives $x = e^t$ as the solution of 
(3.15). Because $x = e^t > 0$, it follows that $u^* = x > 0$; thus $\mu_1 = 0$ from 
(3.20). 

From (3.19) we then have 

$$\mu_2 = 1 + \lambda.$$

Substituting this into (3.18) and solving gives 

$$1 + \lambda(t) = e^{1-t}. \tag{3.22}$$

Since the right-hand side of (3.22) is always positive, $u^* = x$ satisfies 
(3.17). Notice that $\mu_2 = e^{1-t} > 0$ and $x - u^* = 0$, so (3.21) holds. 
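
The verification in Example 3.1 can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names `example_3_1` and `residuals` are hypothetical, and the adjoint derivative is approximated by a central difference. It evaluates the candidate solution on a grid and reports the largest violation of the adjoint equation (3.18), the condition (3.19), and the complementary slackness conditions (3.20) and (3.21).

```python
import math

def example_3_1(t):
    """Candidate solution of Example 3.1 at time t (closed form from the text)."""
    x = math.exp(t)            # state under u* = x with x(0) = 1
    u = x                      # candidate control u* = x
    mu2 = math.exp(1.0 - t)    # multiplier of x - u >= 0, equal to 1 + lambda
    lam = mu2 - 1.0            # adjoint from (3.22)
    mu1 = 0.0                  # multiplier of u >= 0, zero because u > 0
    return x, u, lam, mu1, mu2

def residuals(t, h=1e-6):
    """Residuals of (3.18)-(3.21); all should be close to zero."""
    x, u, lam, mu1, mu2 = example_3_1(t)
    # central-difference approximation of d(lambda)/dt
    lam_dot = (example_3_1(t + h)[2] - example_3_1(t - h)[2]) / (2.0 * h)
    return {
        "adjoint (3.18)": lam_dot + mu2,
        "dL/du (3.19)": 1.0 + lam + mu1 - mu2,
        "slackness (3.20)": mu1 * u,
        "slackness (3.21)": mu2 * (x - u),
    }

grid = [i / 10 for i in range(11)]
worst = max(abs(v) for t in grid for v in residuals(t).values())
terminal_adjoint = example_3_1(1.0)[2]

# objective J = integral of u over [0, 1], by the midpoint rule
n = 1000
objective = sum(example_3_1((i + 0.5) / n)[1] for i in range(n)) / n

print(worst, terminal_adjoint, objective)
```

The objective value should agree with $e - 1$, the integral of $e^t$ over $[0, 1]$, up to the quadrature error.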
3.2 Sufficiency Conditions 

In this section we will state without proof a number of sufficiency results. 
These results require the concepts of concave and quasiconcave functions. 

Recall from Section 1.4 that with $D \subset E^n$ a convex set, a function 
$\psi : D \to E^1$ is concave, if for any $y, z \in D$ and for all $p \in [0, 1]$, 

$$\psi(py + (1 - p)z) \ge p\psi(y) + (1 - p)\psi(z). \tag{3.23}$$

The function $\psi$ is quasiconcave if (3.23) is relaxed to 

$$\psi(py + (1 - p)z) \ge \min\{\psi(y), \psi(z)\}. \tag{3.24}$$

Finally, $\psi$ is strictly concave if (3.23) holds with a strict inequality 
whenever $y \ne z$ and $p \in (0, 1)$. 

We should also mention that $\psi$ is convex, quasiconvex, or strictly 
convex if $-\psi$ is concave, quasiconcave, or strictly concave, respectively. 
For further details on the properties of such functions, see Mangasarian 
(1969). 
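
To make the difference between (3.23) and (3.24) concrete, the sketch below tests both inequalities on a finite sample of points. This is only an illustration under stated assumptions: the helper names are hypothetical, the functions are scalar, and a sampled check can reveal a violation but can never prove concavity or quasiconcavity.

```python
import itertools

def violates_concavity(psi, points, weights, tol=1e-12):
    """Return True if (3.23) fails for some sampled pair (y, z) and weight p."""
    for y, z in itertools.product(points, repeat=2):
        for p in weights:
            lhs = psi(p * y + (1 - p) * z)
            if lhs < p * psi(y) + (1 - p) * psi(z) - tol:
                return True
    return False

def violates_quasiconcavity(psi, points, weights, tol=1e-12):
    """Return True if (3.24) fails for some sampled pair (y, z) and weight p."""
    for y, z in itertools.product(points, repeat=2):
        for p in weights:
            lhs = psi(p * y + (1 - p) * z)
            if lhs < min(psi(y), psi(z)) - tol:
                return True
    return False

points = [-2.0 + 0.5 * i for i in range(9)]   # sample of D = [-2, 2]
weights = [0.0, 0.25, 0.5, 0.75, 1.0]

neg_square = lambda y: -y * y   # concave, hence also quasiconcave
cube = lambda y: y ** 3         # monotone, so quasiconcave, but not concave
square = lambda y: y * y        # neither concave nor quasiconcave on D

for name, psi in [("-y^2", neg_square), ("y^3", cube), ("y^2", square)]:
    print(name,
          "violates (3.23):", violates_concavity(psi, points, weights),
          "violates (3.24):", violates_quasiconcavity(psi, points, weights))
```

The cubic shows why quasiconcavity is the weaker requirement: it passes the sampled test of (3.24) while failing (3.23).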

We can now state two sufficiency results concerning the problems 
with mixed constraints stated in (3.1)-(3.6). For their proofs, see 
Seierstad and Sydsæter (1987, p.287). 

Theorem 3.1 Let $(x^*, u^*, \lambda, \mu, \alpha, \beta)$ satisfy the necessary conditions in 
(3.11). If $H(x, u, \lambda(t), t)$ is concave in $(x, u)$ at each $t \in [0, T]$, $S$ in 
(3.2) is concave in $x$, $g$ in (3.3) is quasiconcave in $(x, u)$, $a$ in (3.4) is 
quasiconcave in $x$, and $b$ in (3.5) is linear in $x$, then $(x^*, u^*)$ is optimal. 

The concavity of the Hamiltonian with respect to $(x, u)$ is a crucial 
condition in Theorem 3.1. Unfortunately, a number of management science 
and economics models lead to problems that do not satisfy this concavity 
condition. For this reason, we want to generalize Theorem 2.1 of Chapter 
2 to the control problem involving mixed constraints under consideration 
in this chapter. This provides the second sufficiency result in the case of 
mixed constraints. The result replaces the concavity requirement on the 
Hamiltonian in Theorem 3.1 by a concavity requirement on $H^0$, where 

$$H^0(x, \lambda, t) = \max_{\{u \mid g(x,u,t) \ge 0\}} H(x, u, \lambda, t). \tag{3.25}$$

Theorem 3.2 Theorem 3.1 remains valid if 

$$H^0(x^*(t), \lambda(t), t) = H(x^*(t), u^*(t), \lambda(t), t), \quad t \in [0, T],$$

and, if in addition, we drop the quasiconcavity requirement on $g$ and 
replace the concavity requirement on $H$ in Theorem 3.1 by the following 
assumption: For each $t \in [0, T]$, if we define $A_1(t) = \{x \mid g(x, u, t) \ge 0$ 
for some $u\}$, then $H^0(x, \lambda(t), t)$ is concave on $A_1(t)$, if $A_1(t)$ is convex. 
If $A_1(t)$ is not convex, we assume that $H^0$ has a concave extension to 
$\operatorname{co}(A_1(t))$, the convex hull of $A_1(t)$. 

In Exercise 3.3 you are asked to check sufficiency conditions stated 
in Theorem 3.1 for Example 3.1. 

3.3 Current-Value Formulation 

In most management science and economics problems, the objective 
function is usually formulated in money or utility terms. The future streams 
of money or utility are usually discounted. 

For this section, let us assume a constant continuous discount rate 
$\rho > 0$. The discounted objective function can now be written as a 
special case of (3.2) by assuming that the time dependence of the relevant 
functions comes only through the discount factor. Thus, 

$$F(x, u, t) = \phi(x, u)e^{-\rho t} \quad \text{and} \quad S(x, T) = \sigma(x)e^{-\rho T}.$$

Now, the objective is to 

$$\text{maximize} \left\{ J = \int_0^T \phi(x, u)e^{-\rho t}\,dt + \sigma[x(T)]e^{-\rho T} \right\} \tag{3.26}$$

subject to (3.1) and (3.3)-(3.5). 

For this problem, the standard Hamiltonian is 

$$H^s := e^{-\rho t}\phi(x, u) + \lambda^s f(x, u, t) \tag{3.27}$$

and the standard Lagrangian is 

$$L^s := H^s + \mu^s g(x, u, t) \tag{3.28}$$

with the standard adjoint variables $\lambda^s$ and standard multipliers $\alpha^s$ and 
$\beta^s$ satisfying 

$$\dot{\lambda}^s = -L^s_x, \tag{3.29}$$

$$\begin{aligned} \lambda^s(T) &= S_x[x(T), T] + \alpha^s a_x(x(T), T) + \beta^s b_x(x(T), T) \\ &= e^{-\rho T}\sigma_x[x(T)] + \alpha^s a_x(x(T), T) + \beta^s b_x(x(T), T), \end{aligned} \tag{3.30}$$

$$\alpha^s \ge 0, \quad \alpha^s a(x(T), T) = 0, \tag{3.31}$$

and $\mu^s$ satisfying 

$$\mu^s \ge 0, \quad \mu^s g = 0. \tag{3.32}$$

We use superscript $s$ in this section to distinguish these from the 
current-value functions defined in the following. Elsewhere, we do not 
need to make the distinction explicitly since we will either be using the 
standard definitions or the current-value definitions of these functions. 
The reader will always be able to tell from the context which is meant. 
We now define the current-value Hamiltonian 

$$H[x, u, \lambda, t] := \phi(x, u) + \lambda f(x, u, t) \tag{3.33}$$

and the current-value Lagrangian 

$$L[x, u, \lambda, \mu, t] := H + \mu g(x, u, t). \tag{3.34}$$

To see why we can do this, we note that if we define 

$$\lambda := e^{\rho t}\lambda^s \quad \text{and} \quad \mu := e^{\rho t}\mu^s, \tag{3.35}$$

we can rewrite (3.27) and (3.28) as 

$$H = e^{\rho t}H^s \quad \text{and} \quad L = e^{\rho t}L^s. \tag{3.36}$$

Since $e^{\rho t} > 0$, maximizing $H^s$ with respect to $u$ at time $t$ is equivalent to 
maximizing the current-value Hamiltonian $H$ with respect to $u$ at time 
$t$. Furthermore, from (3.35) 

$$\dot{\lambda} = \rho e^{\rho t}\lambda^s + e^{\rho t}\dot{\lambda}^s. \tag{3.37}$$

The first term on the right-hand side of (3.37) is simply $\rho\lambda$ using the 
definition in (3.35). To simplify the second term we use the differential 
equation (3.29) for $\lambda^s$ and the fact that $L_x = e^{\rho t}L^s_x$ from (3.36). Thus, 

$$\dot{\lambda} = \rho\lambda - L_x, \quad \lambda(T) = \sigma_x[x(T)] + \alpha a_x(x(T), T) + \beta b_x(x(T), T), \tag{3.38}$$

where the terminal condition for $\lambda(T)$ follows immediately from the 
terminal condition for $\lambda^s(T)$ in (3.30), the definition (3.36), 

$$\alpha = e^{\rho T}\alpha^s \quad \text{and} \quad \beta = e^{\rho T}\beta^s. \tag{3.39}$$

The complementary slackness conditions satisfied by the current-value 
Lagrange multipliers $\mu$ and $\alpha$ are 

$$\mu \ge 0, \quad \mu g = 0, \quad \alpha \ge 0, \quad \text{and} \quad \alpha a = 0$$

on account of (3.31), (3.32), (3.35), and (3.39). 

Finally, the current-value version of (3.14), i.e., the necessary 
transversality condition for $T^*$ to be optimal, is 

$$H[x^*(T^*), u^*(T^*), \lambda(T^*), T^*] - \rho\sigma[x^*(T^*)] = 0. \tag{3.40}$$

You are asked to prove this result in Exercise 3.8. 

We will now state the maximum principle in terms of the current-value 
functions. It states that the necessary conditions for $u^*$ to be an 
optimal control are that there exist $\lambda$ and $\mu$ such that the conditions 
(3.41) hold, i.e., 

$$
\left.
\begin{array}{l}
\dot{x}^* = f(x^*, u^*, t), \\
a(x^*(T), T) \ge 0, \quad b(x^*(T), T) = 0, \\
\dot{\lambda} = \rho\lambda - L_x[x^*, u^*, \lambda, \mu, t], \\
\text{with the terminal conditions} \\
\lambda(T) = \sigma_x(x^*(T)) + \alpha a_x(x^*(T), T) + \beta b_x(x^*(T), T), \\
\alpha \ge 0, \quad \alpha a(x^*(T), T) = 0, \\
\text{and the Hamiltonian maximizing condition} \\
H[x^*(t), u^*(t), \lambda(t), t] \ge H[x^*(t), u, \lambda(t), t] \\
\text{at each } t \in [0, T] \text{ for all } u \text{ satisfying} \\
g[x^*(t), u, t] \ge 0, \\
\text{and the Lagrange multipliers } \mu(t) \text{ are such that} \\
\left.\dfrac{\partial L}{\partial u}\right|_{u=u^*(t)} = 0, \ \text{and the complementary slackness} \\
\text{conditions } \mu(t) \ge 0 \text{ and } \mu(t) g(x^*, u^*, t) = 0 \text{ hold.}
\end{array}
\right\} \tag{3.41}
$$

As in Section 3.1, when the terminal constraint is given by (3.6) 
instead of (3.4) and (3.5), we need to replace the terminal condition on 
the state and the adjoint variables, respectively, by (3.12) and 

$$[\lambda(T) - \sigma_x(x^*(T))][y - x^*(T)] \ge 0, \quad \forall y \in Y. \tag{3.42}$$

Note also that Remark 3.2 applies here as well for the fixed-end-point 
problem. 



Example 3.2 Use the current-value maximum principle to solve the 
following consumption problem for $\rho = r$: 

$$\max \left\{ J = \int_0^T e^{-\rho t} \ln C(t)\,dt \right\}$$

subject to the wealth dynamics 

$$\dot{W} = rW - C, \quad W(0) = W_0, \quad W(T) = 0,$$

where $W_0 > 0$. Note that the condition $W(T) = 0$ is sufficient to make 
$W(t) \ge 0$ for all $t$. We can interpret $\ln C(t)$ as the utility of consuming 
at the rate $C(t)$ per unit time at time $t$. 

Solution. In Exercise 2.14(a) you used the standard Hamiltonian 
formulation to solve the problem. We now demonstrate the use of the 
current-value Hamiltonian formulation: 

$$H = \ln C + \lambda(rW - C), \tag{3.43}$$

where the adjoint equation is 

$$\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial W} = \rho\lambda - r\lambda = 0, \quad \lambda(T) = \beta, \tag{3.44}$$

since we assume $\rho = r$, and where $\beta$ is some constant to be determined 
(see Exercise 2.13 and Remark 3.2 because of the fixed-end-point 
condition $W(T) = 0$). The solution of (3.44) is simply $\lambda(t) = \beta$ for $0 \le t \le T$. 

To find the optimal control, we maximize $H$ by differentiating (3.43) 
with respect to $C$ and setting the result to zero: 

$$\frac{\partial H}{\partial C} = \frac{1}{C} - \lambda = 0,$$

which implies $C = 1/\lambda = 1/\beta$. Using this consumption level in the 
wealth dynamics gives 

$$\dot{W} = rW - \frac{1}{\beta},$$

which can be solved as 

$$W(t) = W_0 e^{rt} - \frac{1}{\beta r}(e^{rt} - 1).$$

Setting $W(T) = 0$ gives 

$$\beta = \frac{1 - e^{-rT}}{rW_0}.$$

Therefore, the optimal consumption 

$$C^*(t) = \frac{1}{\beta} = \frac{rW_0}{1 - e^{-rT}} = \frac{\rho W_0}{1 - e^{-\rho T}},$$

since $\rho = r$. Note that the present value of this stream of consumption 
discounted at the rate $\rho$ equals the initial wealth $W_0$. 

In Exercise 3.9 you are asked to solve this problem for $\rho \ne r$ with 
the use of the current-value formulation. 
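
The closed-form solution of Example 3.2 is easy to check with a few lines of code. The sketch below is an illustration only: the function names and the parameter values ($W_0 = 100$, $r = \rho = 0.05$, $T = 20$) are assumptions chosen for the example and do not come from the text. It computes $\beta$ and the constant consumption rate, then confirms that terminal wealth is zero and that the present value of consumption equals $W_0$.

```python
import math

def consumption_plan(W0, r, T):
    """Return (beta, C) for Example 3.2 with rho = r."""
    beta = (1.0 - math.exp(-r * T)) / (r * W0)   # terminal adjoint value
    C = 1.0 / beta                               # constant consumption rate
    return beta, C

def wealth(t, W0, r, beta):
    """Wealth path W(t) = W0 e^{rt} - (e^{rt} - 1) / (beta r)."""
    return W0 * math.exp(r * t) - (math.exp(r * t) - 1.0) / (beta * r)

# illustrative parameter values (assumed, not from the text)
W0, r, T = 100.0, 0.05, 20.0

beta, C = consumption_plan(W0, r, T)
terminal_wealth = wealth(T, W0, r, beta)

# present value of a constant stream C over [0, T] discounted at rate rho = r
present_value = C * (1.0 - math.exp(-r * T)) / r

print(beta, C, terminal_wealth, present_value)
```

Because $\lambda(t) = \beta$ is constant when $\rho = r$, the same `beta` is the current-value marginal value of wealth at every $t$.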

The interpretation of the current-value functions is that these 
functions reflect the values at time $t$ in terms of the current (or, time-$t$) 
dollars. The standard functions, on the other hand, reflect the values at 
time $t$ in terms of time-zero dollars. For example, the standard adjoint 
variable $\lambda^s(t)$ can be interpreted as the marginal value per unit change 
in the state at time $t$ in the same units as that of the objective function 
(3.26), i.e., in terms of time-zero dollars (see Section 2.2.4). On the other 
hand, $\lambda(t) = e^{\rho t}\lambda^s(t)$ is obviously the same value given in current (or, 
time-$t$) dollars. 

For the consumption problem of Example 3.2, note that the 
current-value adjoint function $\lambda(t) = (1 - e^{-rT})/rW_0$ for all $t$. This gives the 
marginal value, in time-$t$ dollars, of a unit increase in wealth at time 
$t$. In Exercise 2.14 the standard adjoint variable was 
$\lambda^s(t) = e^{-rt}(1 - e^{-\rho T})/\rho W_0$, which for Example 3.2 with $\rho = r$ can be written as 
$\lambda^s(t) = e^{-\rho t}(1 - e^{-rT})/rW_0 = e^{-\rho t}\lambda(t)$. Thus, it is clear that $\lambda^s(t)$ expresses the 
same marginal value in time-zero dollars. 

In Exercise 3.6, you are asked to formulate and solve a consumption 
problem of an economy. The problem is a linear version of the famous 
Ramsey model; see Ramsey (1928) and Feichtinger and Hartl (1986, 
p.201). 



3.4 Terminal Conditions 

Terminal conditions on the adjoint variables, also known as transversality 
conditions, are extremely important in optimal control theory. Because 
the salvage value function $\sigma(x)$ is known, we know the marginal value per 
unit change in the state at the terminal time $T$. Since $\lambda(T)$ must be equal 
to this marginal value, it provides us with the boundary conditions for 
the differential equations for the adjoint variables. We shall now derive 
the terminal or transversality conditions for the current-value adjoint 
variables for some of the important special cases of the general problem 
treated in Section 3.3. We also summarize these conditions in Table 3.1. 

Case 1 Free-end point. In this case we do not put any constraint on the 
terminal state $x(T)$. Thus, 

$$x(T) \in X.$$

From the terminal conditions in (3.11), it is obvious that for the 
free-end-point problem, i.e., when $Y = X$, 

$$\lambda(T) = \sigma_x[x^*(T)]. \tag{3.45}$$

This includes the condition $\lambda(T) = 0$ in the special case of $\sigma(x) \equiv 0$; see 
Example 3.1, specifically (3.18). These conditions are repeated in Row 
1 of Table 3.1. 

The economic interpretation of $\lambda(T)$ is that it equals the marginal 
value of a unit increment in the terminal state evaluated at its optimal 
value $x^*(T)$. 

Case 2 Fixed-end point. In this case, which is the other extreme from 
the free-end-point case, the terminal condition is 

$$b(x(T), T) = x(T) - k = 0,$$

and the transversality condition in (3.11) does not provide any 
information for $\lambda(T)$. However, as mentioned in Remark 3.2 and recalled 
subsequent to (3.41), $\lambda(T)$ will be some constant $\beta$, which will be 
determined by solving the boundary value problem, where the differential 
equations system consists of the state equations with both initial and 
terminal conditions and the adjoint equations with no boundary conditions. 
This condition is repeated in Row 2 of Table 3.1. 

The economic interpretation of $\lambda(T) = \beta$ is as follows. The constant 
$\beta$ times $\delta k$, i.e., $\beta\,\delta k$, provides the value that could be gained if the 
fixed-end point were specified to be $k + \delta k$ instead of $k$. See Example 3.2, 
especially (3.44). 

Case 3 One-sided constraints. Here we restrict the ending value of the 
state variable to be in a one-sided interval, namely, 

$$a(x(T), T) = x(T) - k \ge 0,$$

where $k \in X$. In this case it is possible to show that 

$$\lambda(T) \ge \sigma_x[x^*(T)] \tag{3.46}$$

and 

$$\{\lambda(T) - \sigma_x[x^*(T)]\}\{k - x^*(T)\} = 0. \tag{3.47}$$

For $\sigma(x) \equiv 0$, these terminal conditions can be written as 

$$\lambda(T) \ge 0 \quad \text{and} \quad \lambda(T)[k - x^*(T)] = 0. \tag{3.48}$$

These conditions are repeated in Row 3 of Table 3.1. The conditions 
for the opposite case when the terminal constraint is 

$$x(T) - k \le 0$$

are stated in Row 4 of Table 3.1. 

Case 4 A general case. A general ending condition is 

$$x(T) \in Y \subset X,$$

which is already stated in (3.6). The transversality conditions are 
specified in (3.42) and repeated in Row 5 of Table 3.1. 



Example 3.3 Consider the problem: 

$$\max \left\{ J = \int_0^2 -x\,dt \right\}$$

subject to 

$$\dot{x} = u, \quad x(0) = 1, \quad x(2) \ge 0, \tag{3.49}$$

$$-1 \le u \le 1. \tag{3.50}$$

Solution. The Hamiltonian is 

$$H = -x + \lambda u.$$

Clearly the optimal control has the form 

$$u^* = \operatorname{bang}[-1, 1; \lambda]. \tag{3.51}$$

The adjoint equation is 

$$\dot{\lambda} = 1 \tag{3.52}$$

with the transversality conditions 

$$\lambda(2) \ge 0 \quad \text{and} \quad \lambda(2)x(2) = 0, \tag{3.53}$$

obtained from (3.48) or from Row 3 of Table 3.1. Since $\lambda(t)$ is 
monotonically increasing, the control (3.51) can switch at most once, and it can 
only switch from $u^* = -1$ to $u^* = 1$. Let the switching time be $t^* \le 2$. 
Then the optimal control is 

$$u^*(t) = \begin{cases} -1 & \text{for } 0 \le t \le t^*, \\ +1 & \text{for } t^* < t \le 2. \end{cases} \tag{3.54}$$

Since the control switches at $t^*$, $\lambda(t^*)$ must be 0. Solving (3.52) we get 

$$\lambda(t) = t - t^*.$$

There are two cases $t^* < 2$ and $t^* = 2$. We analyze the first case first. 
Here $\lambda(2) > 0$; therefore from (3.53), $x(2) = 0$. Solving for $x$ with $u^*$ 
given in (3.54), we obtain 

$$x(t) = \begin{cases} 1 - t & \text{for } 0 \le t \le t^*, \\ (t - t^*) + x(t^*) = t + 1 - 2t^* & \text{for } t^* < t \le 2. \end{cases}$$

Therefore, setting $x(2) = 0$ gives 

$$x(2) = 3 - 2t^* = 0,$$

which makes $t^* = 3/2$. Since this satisfies $t^* < 2$, we do not have to deal 
with the case $t^* = 2$. Figure 3.1 shows the optimal state and adjoint 
trajectories. 

In Exercise 3.11, you are asked to rework Example 3.3 with the 
terminal condition $x(2) \ge 0$ replaced by $x(2) \ge 1$. 
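
As a numerical cross-check of Example 3.3, the sketch below evaluates the objective $J = -\int_0^2 x\,dt$ for several candidate switching times and keeps only those that satisfy the terminal constraint $x(2) \ge 0$. It is an illustration under stated assumptions: the function names and the list of candidate switching times are hypothetical, the control is restricted to the single-switch form (3.54), and the integral is approximated by the trapezoidal rule.

```python
def state_at(t, t_switch):
    """State x(t) when u = -1 on [0, t_switch] and u = +1 afterwards, x(0) = 1."""
    if t <= t_switch:
        return 1.0 - t
    return t + 1.0 - 2.0 * t_switch

def objective(t_switch, n=2000):
    """J = -integral of x over [0, 2], by the trapezoidal rule with n panels."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        a, b = i * h, (i + 1) * h
        total += 0.5 * h * (state_at(a, t_switch) + state_at(b, t_switch))
    return -total

candidates = [0.0, 0.5, 1.0, 1.25, 1.5, 1.75, 2.0]   # assumed test grid

# keep only switching times whose terminal state satisfies x(2) >= 0
feasible = [ts for ts in candidates if state_at(2.0, ts) >= -1e-12]
best = max(feasible, key=objective)

# adjoint lambda(t) = t - t* evaluated at t = 2 for the best switching time
lam_2 = 2.0 - best
transversality_product = lam_2 * state_at(2.0, best)

print(best, objective(best), lam_2, transversality_product)
```

Later switching times would give a larger objective but violate $x(2) \ge 0$, which is why the constraint is active at the optimum and $\lambda(2) > 0$.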
Figure 3.1: State and Adjoint Trajectories in Example 3.3 

An important situation which gives rise to a one-sided constraint 
occurs when there is an isoperimetric or budget constraint of the form 

$$\int_0^T l(x, u, t)\,dt \le K, \tag{3.55}$$

where $l : E^n \times E^m \times E^1 \to E^1$ is assumed nonnegative, bounded, and 
continuously differentiable, and $K$ is a positive constant representing the 
amount of the budget. To see how this constraint can be converted into 
a one-sided constraint, we define an additional state variable $x_{n+1}$ by 
the state equation 

$$\dot{x}_{n+1} = -l(x, u, t), \quad x_{n+1}(0) = K, \quad x_{n+1}(T) \ge 0. \tag{3.56}$$

We employ the index $n + 1$ simply because we already have $n$ state 
variables $x = (x_1, x_2, \ldots, x_n)$. Also the above equation becomes an 
additional equation which is added to the original system. 
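
The state augmentation in (3.56) translates directly into code. The following minimal sketch assumes hypothetical helper names (`augment_with_budget`, `euler`), a one-dimensional system $\dot{x} = u$, a budget integrand $l = u^2$, and forward Euler integration; none of these choices come from the text. The last component of the augmented state tracks the remaining budget, so the isoperimetric constraint (3.55) becomes the terminal check $x_{n+1}(T) \ge 0$.

```python
def augment_with_budget(f, l, K):
    """Build augmented dynamics with x_{n+1}' = -l(x, u, t), x_{n+1}(0) = K."""
    def f_aug(x_aug, u, t):
        x = x_aug[:-1]                       # original n state variables
        return list(f(x, u, t)) + [-l(x, u, t)]

    def initial(x0):
        return list(x0) + [K]                # append the full budget

    return f_aug, initial

def euler(f, x0, control, T, n):
    """Integrate x' = f(x, u(t), t) on [0, T] with n forward Euler steps."""
    h = T / n
    x = list(x0)
    for i in range(n):
        t = i * h
        dx = f(x, control(t), t)
        x = [xi + h * di for xi, di in zip(x, dx)]
    return x

# illustrative system (assumed): x' = u, budget integrand l = u^2, K = 1
f = lambda x, u, t: [u]
l = lambda x, u, t: u * u
f_aug, initial = augment_with_budget(f, l, K=1.0)

x_T = euler(f_aug, initial([0.0]), control=lambda t: 0.5, T=2.0, n=1000)
budget_left = x_T[-1]          # x_{n+1}(T); constraint (3.55) holds iff >= 0
print(x_T[0], budget_left)
```

With the constant control $u = 0.5$ on $[0, 2]$ the budget used is $0.25 \times 2 = 0.5$, so half of $K$ remains and the terminal condition in (3.56) is satisfied.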

In Exercise 3.12 you will be asked to rework the leaky reservoir prob- 
lem of Exercise 2.12 with an additional isoperimetric constraint on the 
total amount of water available. See also Exercises 7.34 - 7.36. 

In Table 3.1, we have summarized all the terminal or transversality 
conditions discussed above. In Section 3.6 we discuss model types. We 
will see that, given the initial state $x_0$, we can completely specify a 
control model by selecting a model type and a transversality condition. 

3.4.1 Examples Illustrating Terminal Conditions 

In the first example we will solve a variation of the consumption prob- 
lem in Example 3.2. It illustrates the use of one-sided transversality 
conditions in the current-value formulation. 

Example 3.4 Let us modify the objective function of the consumption 
problem (Example 3.2) to take into account the salvage (bequest) value of 
terminal wealth. This is the utility to the individual of leaving an estate 
to his heirs upon death. Let us now assume that $T$ denotes the time 
of the individual's death and $BW(T)$, where $B$ is a positive constant, 
denotes his utility of leaving wealth $W(T)$ to his heirs upon death. Then, 
the problem is: 

$$\max \left\{ J = \int_0^T e^{-\rho t} \ln C(t)\,dt + e^{-\rho T} B W(T) \right\} \tag{3.57}$$

subject to the wealth equation 

$$\dot{W} = rW - C, \quad W(0) = W_0, \quad W(T) \ge 0. \tag{3.58}$$

Solution. The Hamiltonian for the problem is given in (3.43), and 
the adjoint equation is given in (3.44) except that the transversality 
conditions are from Row 3 of Table 3.1: 

$$\lambda(T) \ge B, \quad [\lambda(T) - B]W(T) = 0. \tag{3.59}$$

In Example 3.2 the value of $\beta$, which was the terminal value of the adjoint 
variable, was 

$$\beta = \frac{1 - e^{-rT}}{rW_0}.$$
| Row | Constraint on $x(T)$ | Description | $\lambda(T)$ | $\lambda(T)$ when $\sigma \equiv 0$ |
|---|---|---|---|---|
| 1 | $x(T) \in Y = X$ | Free-end point | $\lambda(T) = \sigma_x[x^*(T)]$ | $\lambda(T) = 0$ |
| 2 | $x(T) = k \in X$, i.e., $Y = \{k\}$ | Fixed-end point | $\lambda(T) =$ a constant to be determined | $\lambda(T) =$ a constant to be determined |
| 3 | $x(T) \in X \cap [k, \infty)$, i.e., $Y = \{x \mid x \ge k\}$ | One-sided constraints $x(T) \ge k$ | $\lambda(T) \ge \sigma_x[x^*(T)]$ and $\{\lambda(T) - \sigma_x[x^*(T)]\}\{k - x^*(T)\} = 0$ | $\lambda(T) \ge 0$ and $\lambda(T)[k - x^*(T)] = 0$ |
| 4 | $x(T) \in X \cap (-\infty, k]$, i.e., $Y = \{x \mid x \le k\}$ | One-sided constraints $x(T) \le k$ | $\lambda(T) \le \sigma_x[x^*(T)]$ and $\{\lambda(T) - \sigma_x[x^*(T)]\}\{k - x^*(T)\} = 0$ | $\lambda(T) \le 0$ and $\lambda(T)[k - x^*(T)] = 0$ |
| 5 | $x(T) \in Y \subset X$ | General constraints | $\{\lambda(T) - \sigma_x[x^*(T)]\}\{y - x^*(T)\} \ge 0 \ \ \forall y \in Y$ | $\lambda(T)[y - x^*(T)] \ge 0 \ \ \forall y \in Y$ |

Note 1. In Table 3.1, $x(T)$ denotes the (column) vector of $n$ state variables and 
$\lambda(T)$ denotes the (row) vector of $n$ adjoint variables at the terminal time $T$. $X \subset E^n$ 
denotes the reachable set of terminal states obtained by using all possible admissible 
controls, $Y$ is a subset of $X$ and $k$ is an element of $X$. The function $\sigma[x(T)] : E^n \to E^1$ 
denotes the salvage value. In the case of an unspecified terminal time (i.e., when $T$ is 
free), there is an additional transversality condition (3.40), namely, 

$$H[x^*(T^*), u^*(T^*), \lambda(T^*), T^*] - \rho\sigma[x^*(T^*)] = 0,$$

which must be satisfied by an optimal $T^*$ along with the applicable condition in the 
table with $T^*$ replacing $T$ everywhere. The symbol $*$ denotes the optimal values. 

Note 2. Table 3.1 will provide transversality conditions for the standard Hamiltonian 
formulation if we replace $\sigma$ with $S$, and reinterpret $\lambda$ as being the standard adjoint 
variable everywhere in the table. Also (3.14) is the standard form of (3.40). 

Table 3.1: Summary of the Transversality Conditions 



We now have two cases: (i) $\beta \ge B$ and (ii) $\beta < B$. 

In case (i), the solution of the problem is the same as that of Example 
3.2, because by setting $\lambda(T) = \beta$ and recalling that $W(T) = 0$ in that 
example, it follows that (3.59) holds. 

In case (ii), we set $\lambda(T) = B$ and use (3.44) which is $\dot{\lambda} = 0$. Hence, 
$\lambda(t) = B$ for all $t$. The Hamiltonian maximizing condition remains 
unchanged. Therefore, the optimal consumption is 

$$C = \frac{1}{\lambda} = \frac{1}{B}.$$

Solving (3.58) with this $C$ gives 

$$W(t) = W_0 e^{rt} - \frac{1}{Br}(e^{rt} - 1).$$

It is easy to show that 

$$W(T) = W_0 e^{rT} - \frac{1}{Br}(e^{rT} - 1)$$

is nonnegative since $\beta < B$. Note that (3.59) holds for case (ii). 
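
The two cases of Example 3.4 can be combined into a single rule: the terminal adjoint value is the larger of the Example 3.2 constant and the bequest weight $B$. The sketch below illustrates this under stated assumptions: the function name `bequest_solution` and the parameter values are hypothetical, and $\rho = r$ is assumed as in Example 3.2.

```python
import math

def bequest_solution(W0, r, T, B):
    """Return (lam, C, W_T) for Example 3.4 with rho = r.

    lam is the constant adjoint value, C the consumption rate,
    and W_T the resulting terminal wealth.
    """
    beta = (1.0 - math.exp(-r * T)) / (r * W0)   # adjoint value of Example 3.2
    lam = max(beta, B)       # case (i): beta >= B, case (ii): beta < B
    C = 1.0 / lam            # from the Hamiltonian maximizing condition
    W_T = W0 * math.exp(r * T) - (math.exp(r * T) - 1.0) / (lam * r)
    return lam, C, W_T

# illustrative parameter values (assumed, not from the text)
W0, r, T = 100.0, 0.05, 20.0

# case (i): small bequest weight, all wealth is consumed
lam_i, C_i, WT_i = bequest_solution(W0, r, T, B=0.05)

# case (ii): large bequest weight, positive wealth is left to the heirs
lam_ii, C_ii, WT_ii = bequest_solution(W0, r, T, B=0.2)

print(lam_i, C_i, WT_i)
print(lam_ii, C_ii, WT_ii)
```

In both cases the pair `(lam, W_T)` satisfies (3.59): either $W(T) = 0$ with $\lambda(T) \ge B$, or $\lambda(T) = B$ with $W(T) \ge 0$.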

We shall next apply the maximum principle to solve a time-optimal 
control problem. It is one of the problems used by Pontryagin et al. 
(1962) to illustrate the applications of the maximum principle. The 
problem also elucidates a specific instance of the synthesis of optimal 
controls. 

By the synthesis of optimal controls we mean the procedure of 
"patching" together various forms of the optimal controls obtained from the 
Hamiltonian maximizing condition. A simple example of the synthesis 
occurs in Example 2.4, where $u^* = 1$ when $\lambda > 0$, $u^* = -1$ when $\lambda < 0$, 
and the control is singular when $\lambda = 0$. An optimal trajectory starting 
at the given initial state variables is synthesized from these. In Example 
2.4 this synthesized solution is $u^* = -1$ for $0 \le t < 1$ and $u^* = 0$ for 
$1 \le t \le 2$. Our next example requires a synthesis procedure which is 
more complex. In Chapter 5, both the cash management and equity 
financing models require such synthesis procedures. 



Example 3.5 A Time-Optimal Control Problem. Consider a subway
train of mass $m$ (assume $m = 1$), which moves along a smooth horizontal
track with negligible friction. The position $x$ of the train along the track
at time $t$ is determined by Newton's Second Law of Motion

$$\ddot{x} = u \qquad (3.60)$$

with given initial conditions on $x(0)$ and $\dot{x}(0)$ as

$$x(0) = x_0 \quad \text{and} \quad \dot{x}(0) = y_0,$$

where $u$ is an external controlling force applied to the train. The second-order
differential equation (3.60) can be changed into a system of two
first-order differential equations (see Appendix A)

$$\dot{x} = y, \quad x(0) = x_0,$$
$$\dot{y} = u, \quad y(0) = y_0, \qquad (3.61)$$

where $y(t)$ denotes the velocity of the train at time $t$.

Assume further that for the comfort of the passengers the maximum
acceleration and deceleration are required to be at most 1 measured in
appropriate units. Thus, the control variable constraint is

$$u \in \Omega = [-1, 1]. \qquad (3.62)$$

The problem is to find a control satisfying (3.62) such that the train
stops at the next station assumed to be located at $x = 0$, i.e., $x(T) = 0$
and $y(T) = 0$, in a minimum possible time $T$. We have thus defined the
following optimal control problem:

$$\max\left\{ J = \int_0^T -1\,dt \right\}$$

subject to

$$\dot{x} = y, \quad x(0) = x_0, \quad x(T) = 0,$$
$$\dot{y} = u, \quad y(0) = y_0, \quad y(T) = 0, \qquad (3.63)$$

and the control constraint

$$u \in \Omega = [-1, 1].$$

Note that (3.63) is a fixed-end-point problem with unspecified
terminal time.


Solution. The standard Hamiltonian function in this case is

$$H = -1 + \lambda_1 y + \lambda_2 u,$$

where the adjoint variables $\lambda_1$ and $\lambda_2$ satisfy

$$\dot{\lambda}_1 = 0, \ \lambda_1(T) = c_1 \quad \text{and} \quad \dot{\lambda}_2 = -\lambda_1, \ \lambda_2(T) = c_2,$$

and $c_1$ and $c_2$ are constants to be determined in the case of a fixed-end-point
problem; see Row 2 of Table 3.1. We can integrate these equations
and write the solution in the form

$$\lambda_1 = c_1 \quad \text{and} \quad \lambda_2 = c_2 + c_1(T - t),$$

where $c_1$ and $c_2$ are constants to be determined from the maximum
principle (2.31), condition (3.14), and the specified initial and terminal
values of the state variables. The Hamiltonian maximizing condition
yields the form of the optimal control to be

$$u^*(t) = \mathrm{bang}\{-1, 1;\ c_2 + c_1(T - t)\}. \qquad (3.64)$$

          (a) $u^*(\tau) = -1$ for $t \le \tau \le T$     (b) $u^*(\tau) = +1$ for $t \le \tau \le T$
  $y$     $y = T - t$                              $y = t - T$
  $x$     $x = -(T - t)^2/2$                       $x = (t - T)^2/2$
  Curve   $\Gamma^-$: $x = -y^2/2$ for $y \ge 0$          $\Gamma^+$: $x = y^2/2$ for $y \le 0$

Table 3.2: State Trajectories and Switching Curves

The transversality condition (3.14) with $y(T) = 0$ and $S \equiv 0$ yields

$$H + S_T = \lambda_2(T)u^*(T) - 1 = c_2 u^*(T) - 1 = 0,$$

which together with the bang-bang control policy (3.64) implies either

$$\lambda_2(T) = c_2 = -1 \quad \text{and} \quad u^*(T) = -1,$$

or

$$\lambda_2(T) = c_2 = +1 \quad \text{and} \quad u^*(T) = +1.$$

Since the switching function $c_2 + c_1(T - t)$ is a linear function of the
time remaining, it can change sign at most once. Therefore, we have
two cases: (a) $u^*(\tau) = -1$ in the interval $t \le \tau \le T$ for some $t \ge 0$; (b)
$u^*(\tau) = +1$ in the interval $t \le \tau \le T$ for some $t \ge 0$. We can integrate
(3.61) in each of these cases as shown in Table 3.2. Also in the table we
have the curves $\Gamma^-$ and $\Gamma^+$, which are obtained by eliminating $t$ from
the expressions for $x$ and $y$ in each case. The parabolic curves $\Gamma^-$ and
$\Gamma^+$ are called switching curves and are shown in Figure 3.2.

We can put $\Gamma^-$ and $\Gamma^+$ into a single switching curve $\Gamma$ as

$$y = \Gamma(x) = \begin{cases} -\sqrt{2x}, & x \ge 0, \\ +\sqrt{-2x}, & x < 0. \end{cases} \qquad (3.65)$$



If the initial state $(x_0, y_0)$ lies on the switching curve, then we have
$u^* = +1$ (resp., $u^* = -1$) if $x_0 > 0$ (resp., $x_0 < 0$); i.e., if $(x_0, y_0)$ lies
on $\Gamma^+$ (resp., $\Gamma^-$). If the initial state $(x_0, y_0)$ is not on the switching
curve, then we choose, between $u^* = 1$ and $u^* = -1$, that which moves
the system toward the switching curve. By inspection, it is obvious that
above the switching curve we must choose $u^* = -1$ and below we must
choose $u^* = +1$.

Figure 3.2: Minimum Time Optimal Response for Problem (3.63)
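
The rule just described is a feedback law: the control depends only on where the current state sits relative to the switching curve (3.65). A minimal sketch follows; the function names, the tolerance, and the convention of returning 0 at the origin are illustrative assumptions, not part of the text.

```python
import math

def switching_curve(x):
    """Gamma(x) from (3.65): the velocity on the switching curve at position x."""
    return -math.sqrt(2.0 * x) if x >= 0 else math.sqrt(-2.0 * x)

def feedback_control(x, y, tol=1e-9):
    """Time-optimal control as a function of the current state (x, y)."""
    if abs(x) < tol and abs(y) < tol:
        return 0.0                      # assumed convention: at rest at the station
    gap = y - switching_curve(x)
    if gap > tol:
        return -1.0                     # above the curve: decelerate
    if gap < -tol:
        return 1.0                      # below the curve: accelerate
    return 1.0 if x > 0 else -1.0       # on Gamma+ (x > 0) or on Gamma- (x < 0)

print(feedback_control(1.0, 1.0))    # a state above the curve
print(feedback_control(-1.0, -1.0))  # a state below the curve
print(feedback_control(2.0, -2.0))   # a state on Gamma+
```

Writing the policy this way makes the synthesis explicit: the two bang-bang arcs are patched together by a single test against $\Gamma(x)$.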

The other curves in Figure 3.2 are solutions of the differential equations
starting from initial points $(x_0, y_0)$. If $(x_0, y_0)$ lies above the switching
curve $\Gamma$ as shown in Figure 3.2, we use $u^* = -1$ to compute the curve
as follows:

$$\dot{x} = y, \quad x(0) = x_0,$$
$$\dot{y} = -1, \quad y(0) = y_0.$$

Integrating these equations gives

$$y = -t + y_0,$$
$$x = -\frac{t^2}{2} + y_0 t + x_0.$$

Elimination of $t$ between these two gives

$$x = \frac{y_0^2 - y^2}{2} + x_0. \qquad (3.66)$$

This is the equation of the parabola in Figure 3.2 through $(x_0, y_0)$. The
point of intersection of the curve (3.66) with the switching curve $\Gamma^+$ is
obtained by solving (3.66) and the equation for $\Gamma^+$, namely $2x = y^2$,
simultaneously, which gives

$$x^* = \frac{y_0^2 + 2x_0}{4}, \quad y^* = -\sqrt{(y_0^2 + 2x_0)/2}, \qquad (3.67)$$

where the minus sign in the expression for $y^*$ in (3.67) was chosen since
the intersection occurs when $y^*$ is negative. The time $t^*$ to reach the
switching curve, called the switching time, given that we start above it,
is

$$t^* = y_0 - y^* = y_0 + \sqrt{(y_0^2 + 2x_0)/2}. \qquad (3.68)$$

To find the minimum total time to go from the starting point $(x_0, y_0)$
to the origin $(0, 0)$, we substitute $t^*$ into the equation for $\Gamma^+$ in Column
(b) of Table 3.2; this gives

$$T = t^* - y^* = y_0 + \sqrt{2(y_0^2 + 2x_0)}. \qquad (3.69)$$

As a numerical example, start at the point $(1, 1)$. Then, the equation
of the parabola (3.66) is

$$2x = 3 - y^2.$$

The switching point (3.67) is $(3/4, -\sqrt{3/2})$. Finally, the switching time
is $t^* = 1 + \sqrt{3/2}$ from (3.68). Substituting into (3.69), we find the
minimum time to stop is $T = 1 + \sqrt{6}$.

To complete the solution of this numerical example let us evaluate $c_1$
and $c_2$, which are needed to obtain $\lambda_1$ and $\lambda_2$. Since $(1, 1)$ is above the
switching curve, $u^*(T) = 1$, and therefore $c_2 = 1$. To compute $c_1$, we observe
that $c_2 + c_1(T - t^*) = 0$ so that $c_1 = -c_2/(T - t^*) = -1/\sqrt{3/2} = -\sqrt{2/3}$.

In Exercises 3.14 - 3.17, you are asked to work other examples with
different starting points above, below, and on the switching curve. Note
that $t^* = 0$ by definition, if the starting point is on the switching curve.
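
The closed-form expressions (3.68) and (3.69) can be verified by integrating the two bang-bang arcs exactly and confirming that the train ends at rest at the origin. The sketch below does this for a start above the switching curve; the function names are illustrative assumptions.

```python
import math

def switch_and_stop_times(x0, y0):
    """Switching time t* (3.68) and minimum time T (3.69), start above the curve."""
    root = math.sqrt((y0 ** 2 + 2.0 * x0) / 2.0)   # equals -y* in (3.67)
    t_star = y0 + root
    T = y0 + 2.0 * root                            # same as y0 + sqrt(2(y0^2 + 2 x0))
    return t_star, T

def final_state(x0, y0):
    """Apply u = -1 on [0, t*] and u = +1 on [t*, T], integrating exactly."""
    t_star, T = switch_and_stop_times(x0, y0)
    # Phase 1: constant deceleration.
    y1 = y0 - t_star
    x1 = x0 + y0 * t_star - 0.5 * t_star ** 2
    # Phase 2: constant acceleration along Gamma+.
    dt = T - t_star
    y2 = y1 + dt
    x2 = x1 + y1 * dt + 0.5 * dt ** 2
    return x2, y2

t_star, T = switch_and_stop_times(1.0, 1.0)
print(t_star, T)              # 1 + sqrt(3/2) and 1 + sqrt(6), up to rounding
print(final_state(1.0, 1.0))  # approximately (0, 0)
```

For the starting point $(1, 1)$ the state after the first arc is the switching point $(3/4, -\sqrt{3/2})$, in agreement with (3.67).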

3.5 Infinite Horizon and Stationarity 

Thus far, we have studied problems whose horizon is finite or whose
horizon length is a decision variable to be determined. In this section,
we briefly discuss the case of $T = \infty$ in the objective function (3.26),
called the infinite horizon case. This case is especially important in many
economics and management science problems.

When we put $T = \infty$ in the objective functions (3.2) or (3.26), we
will generally get a nonstationary infinite horizon problem in the sense
that the various functions involved depend explicitly on the time variable
$t$. Such problems are extremely hard to solve. So, in this section we will
devote our attention to only stationary infinite horizon problems, which
do not depend explicitly on time $t$. Furthermore, it is reasonable to
assume $\sigma(x) \equiv 0$ in the infinite horizon case.

When it comes to the transversality conditions in the infinite horizon
case, the situation is somewhat more complicated. First, the limit of
the condition in the rightmost column of Table 3.1 as $T \to \infty$ does not
provide us with a necessary transversality condition in the infinite horizon
case; see, e.g., Arrow and Kurz (1970). The limiting transversality
conditions obtained by letting $T \to \infty$ in the present-value formulation
are sufficient for optimality, however, provided the other sufficiency
conditions stated in Theorem 2.1 are also met; see also Seierstad and
Sydsaeter (1987) and Feichtinger and Hartl (1986).

For the important free-end-point case, the following limiting transversality
condition is obtained by letting $T \to \infty$ in the present-value formulation:

$$\lim_{T \to \infty} \lambda^s(T) = 0 \ \Rightarrow\ \lim_{T \to \infty} e^{-\rho T}\lambda(T) = 0. \qquad (3.70)$$

Another important case is that of one-sided constraints

$$\lim_{T \to \infty} x(T) \ge 0.$$

Then, the transversality conditions are

$$\lim_{T \to \infty} e^{-\rho T}\lambda(T) \ge 0 \quad \text{and} \quad \lim_{T \to \infty} e^{-\rho T}\lambda(T)x^*(T) = 0. \qquad (3.71)$$

In most management science problems, time $t$ does not enter explicitly
in the functions $\phi$, $f$, and $g$. The assumption of explicit independence
of $t$ is termed the stationarity assumption. More specifically, with
respect to the problem treated in (3.26), (3.1), and (3.3), where $\phi$ is already
independent of time, and without the terminal condition on $x(T)$,
the stationarity assumption implies

$$f(x, u, t) = f(x, u),$$
$$g(x, u, t) = g(x, u). \qquad (3.72)$$

This means that the state equations, the current-value adjoint equations,
and the current-value Hamiltonian in (3.33) are all explicitly independent
of time $t$. Such a system is termed autonomous.








In the case of autonomous systems, considerable attention is focused
on equilibrium where all motion ceases, i.e., the values of $x$ and $\lambda$ for
which $\dot{x} = 0$ and $\dot{\lambda} = 0$. The notion is that of optimal long-run stationary
equilibrium; see Arrow and Kurz (1970, Chapter 2) and Carlson and
Haurie (1987a, 1996). It is defined by the quadruple $\{\bar{x}, \bar{u}, \bar{\lambda}, \bar{\mu}\}$ satisfying

$$f(\bar{x}, \bar{u}) = 0,$$
$$\rho\bar{\lambda} = L_x[\bar{x}, \bar{u}, \bar{\lambda}, \bar{\mu}],$$
$$\bar{\mu} \ge 0, \quad \bar{\mu}\, g(\bar{x}, \bar{u}) = 0, \qquad (3.73)$$

and

$$H(\bar{x}, \bar{u}, \bar{\lambda}) \ge H(\bar{x}, u, \bar{\lambda})$$

for all $u$ satisfying

$$g(\bar{x}, u) \ge 0.$$

Clearly, if the initial condition $x_0 = \bar{x}$, the optimal control is $u^*(t) = \bar{u}$
for all $t$.

If the constraint involving $g$ is not imposed, $\bar{\mu}$ may be dropped from
the quadruple. In this case, the equilibrium is defined by the triple
$\{\bar{x}, \bar{u}, \bar{\lambda}\}$ satisfying

$$f(\bar{x}, \bar{u}) = 0, \quad \rho\bar{\lambda} = H_x(\bar{x}, \bar{u}, \bar{\lambda}), \quad \text{and} \quad H_u(\bar{x}, \bar{u}, \bar{\lambda}) = 0. \qquad (3.74)$$

It may be worthwhile to remark that the optimal long-run stationary
equilibrium (which is also called the turnpike) is not the same as the optimal
steady-state among the set of all possible steady-states. The latter
concept is termed the Golden Rule or Golden Path in economics, and a
procedure to obtain it is described below. However, the two concepts
are identical if the discount rate $\rho = 0$. See Exercise 3.29.

The Golden Path is obtained by setting $\dot{x} = f(\bar{x}, \bar{u}) = 0$, which
provides the feedback control $\bar{u}(\bar{x})$ that would keep $x(t) = \bar{x}$ over
time. Then, substitute $\bar{u}(\bar{x})$ in the integrand $\phi(x, u)$ of (3.26) to obtain
$\phi(\bar{x}, \bar{u}(\bar{x}))$. The value of $\bar{x}$ that maximizes $\phi(\bar{x}, \bar{u}(\bar{x}))$ yields the
Golden Path. Of course, all of the constraints imposed on the problem
have to be respected in obtaining the Golden Path.

Example 3.6 Let us return to Example 3.2 and now assume that we
have a perpetual charitable trust with initial fund $W_0$, which wants to
maximize its total discounted utility of charities $C(t)$ over time subject
to the solvency condition that $W(t) \ge 0$ for all $t$. It should be clear in
this case that it suffices for the solvency condition to impose only the
terminal condition

$$\lim_{T \to \infty} W(T) \ge 0. \qquad (3.75)$$

Since the pure state constraint $W(t) \ge 0$ need not be imposed here, we
do not require the application of the maximum principle for general state
constraints developed in the next chapter.

For convenience we restate the problem:

$$\max\left\{ J = \int_0^\infty e^{-\rho t} \ln C(t)\,dt \right\}$$

subject to

$$\dot{W} = rW - C, \quad W(0) = W_0 > 0, \qquad (3.76)$$

and (3.75). We again assume $\rho = r$ as in Example 3.2.

Solution. By (3.73) we set

$$r\bar{W} - \bar{C} = 0, \quad \bar{\lambda} = k,$$

where $k$ is a constant to be determined. This gives the optimal control
$\bar{C} = r\bar{W}$, and by setting $\bar{\lambda} = 1/\bar{C} = 1/(r\bar{W})$, we see all the conditions
of (3.73) including the Hamiltonian maximizing condition hold. Furthermore,
$\bar{\lambda}$ and $\bar{W} = W_0$ satisfy the transversality conditions (3.71).
Therefore, by the sufficiency theorem, the control obtained is optimal.
Note that the interpretation of the solution is that the trust spends
only the interest from its endowment $W_0$. Note further that the triple
$(\bar{W}, \bar{C}, \bar{\lambda}) = (W_0, rW_0, 1/(rW_0))$ is an optimal long-run stationary equilibrium
for the problem.
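
As a plausibility check on this result, one can compare the interest-only policy with other proportional spending rules $C = cW$. This family of rules is an assumption introduced here for illustration and is not taken from the text. Under $C = cW$ wealth evolves as $W(t) = W_0 e^{(r-c)t}$, and with $\rho = r$ the discounted utility has the closed form $J(c) = \ln(cW_0)/r + (r - c)/r^2$, which the sketch below evaluates for hypothetical numbers.

```python
import math

def discounted_utility(c, W0, r):
    """J for the rule C = c*W with discount rate rho = r (closed form).

    Integrand: exp(-r t) * [ln(c W0) + (r - c) t], integrated over [0, infinity).
    """
    return math.log(c * W0) / r + (r - c) / r ** 2

# Hypothetical endowment and interest rate.
W0, r = 100.0, 0.05
for c in (0.03, 0.04, 0.05, 0.06, 0.07):
    print(f"c = {c:.2f}  J = {discounted_utility(c, W0, r):.3f}")
```

Within this family the value is largest at $c = r$, that is, when the trust spends exactly the interest, consistent with the stationary equilibrium found above.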

3.6 Model Types 

Optimal control theory has been used to solve problems occurring in en- 
gineering, economics, management science, and other fields. In each field 
of application, certain general kinds of models which we will call model 
types are likely to occur, and each such model requires a specialized form 
of the maximum principle. In Chapter 2 we derived in considerable detail
a simple form of the continuous-time maximum principle. However,
to continue to provide such details for each different version of the maximum
principle that we need in later chapters of this book would be both
repetitive and lengthy.

The purpose of this section is to avoid the latter by listing most 
of the different management science model types that we will use in 
later chapters. For each model type, we will give a brief description of 
the corresponding objective function, state equations, control and state 
inequality constraints, terminal conditions, adjoint equations, and the 
form of the optimal control policy. We will also indicate where each of 
these model types is applied in later chapters. 

The reader may wish to skim this section on first reading to get an 
idea of what it contains, work a few of the exercises, and go on to the 
various functional areas discussed in later chapters. Then, when specific
model types are encountered, the reader may return to read in more
detail the relevant parts of this section.

We are now able to state the general forms of all the models (with one
or two exceptions) that we will use to analyze the applications discussed
in the rest of the book. Some other model types will be explained in
later chapters.

In Table 3.3 we have listed six different combinations of $\phi$ and $f$
functions. If we specify the initial value $x_0$ of the state variable $x$ and
the constraints on the control and state variables, we can get a completely
specified optimal control model by selecting one of the model types in
Table 3.3 together with one of the terminal conditions given in Table 3.1.

The reader will see numerous examples of the uses of Tables 3.1 
and 3.3 when we construct optimal control models of various applied 
situations in later chapters. To help in understanding these, we shall 
give a brief mathematical discussion of the six model types in Table 3.3, 
with an indication of where each model type will be used later in the 
book. 

  Type  Objective function     State equation               Current-value adjoint equation                        Form of optimal
        integrand $\phi =$     $\dot{x} = f =$              $\dot{\lambda} =$                                     control policy
  (a)   $Cx + Du$              $Ax + Bu + d$                $\lambda(\rho - A) - C$                               Bang-Bang
  (b)   $C(x) + Du$            $Ax + Bu + d$                $\lambda(\rho - A) - C_x$                             Bang-Bang+Singular
  (c)   $x'Cx + u'Du$          $Ax + Bu + d$                $\lambda(\rho - A) - 2x'C$                            Linear Decision Rule
  (d)   $C(x) + Du$            $A(x) + Bu + d$              $\lambda(\rho - A_x) - C_x$                           Bang-Bang+Singular
  (e)   $c(x) + q(u)$          $(ax + d)b(u) + e(x)$        $\lambda(\rho - ab(u) - e'(x)) - c_x$                 Interior or Boundary
  (f)   $c(x)q(u)$             $(ax + d)b(u) + e(x)$        $\lambda(\rho - ab(u) - e'(x)) - c_x q(u)$            Interior or Boundary

Note. The current-value Hamiltonian is often used when $\rho > 0$ is the discount rate;
the standard formulation is identical to the current-value formulation when $\rho = 0$.
In Table 3.3, capital letters indicate vector functions and small letters indicate scalar
functions or vectors. A function followed by an argument in parentheses indicates
a nonlinear function; when it is followed by an argument without parentheses, it
indicates a linear function. Thus, $A(x)$ and $a(x)$ are nonlinear vector and scalar
functions, while $Ax$ and $ax$ are linear. The function $d$ is always to be interpreted as
an exogenous function of time only.

Table 3.3: Objective, State, and Adjoint Equations for Various Model
Types

(a) In Model Type (a) of Table 3.3 we see that both $\phi$ and $f$ are
linear functions of their arguments. Hence it is called the linear-linear
case. The Hamiltonian is

$$H = Cx + Du + \lambda(Ax + Bu + d)$$
$$\ \ = Cx + \lambda Ax + \lambda d + (D + \lambda B)u. \qquad (3.77)$$

From (3.77) it is obvious that the optimal policy is bang-bang with the
switching function $(D + \lambda B)$. Since the adjoint equation is independent
of both control and state variables, it can be solved completely without
resorting to two-point boundary value methods. Examples of (a) occur
in the cash balance problem of Section 5.1.1 and the maintenance and
replacement model of Section 9.1.1.
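
To see why no two-point boundary value method is needed in the linear-linear case, consider a scalar instance with constant coefficients. The sketch below assumes a free end point with no salvage value, so that $\lambda(T) = 0$, and control bounds $[u_{\min}, u_{\max}]$; these assumptions, the function names, and the sample numbers are illustrative only.

```python
import math

def adjoint(t, T, A, C, rho):
    """Solve lambda' = lambda*(rho - A) - C backward from lambda(T) = 0."""
    k = rho - A
    if abs(k) < 1e-12:
        return C * (T - t)
    return (C / k) * (1.0 - math.exp(-k * (T - t)))

def bang_bang(t, T, A, B, C, D, rho, u_min, u_max):
    """Maximize (D + lambda*B)*u over [u_min, u_max] at time t."""
    switching = D + adjoint(t, T, A, C, rho) * B
    return u_max if switching > 0 else u_min

# Hypothetical data: lambda(t) = 3 - t, so the switching function is 2 - t.
params = dict(T=3.0, A=0.0, B=1.0, C=1.0, D=-1.0, rho=0.0, u_min=0.0, u_max=1.0)
for t in (0.0, 1.0, 2.5, 3.0):
    print(t, bang_bang(t, **params))
```

The adjoint path is known in closed form before the state is computed, so the control switches at the single instant where $D + \lambda B$ changes sign (here $t = 2$); at that isolated instant the choice of control is immaterial.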

(b) Model Type (b) of Table 3.3 is the same as Model Type (a) except
that the function $C(x)$ is nonlinear. Thus, the term $C_x$ appears in the
adjoint equation, and two-point boundary value methods are needed to
solve the problem. Here, there is the possibility of singular control; see
Section 5.5. A specific example of Model Type (b) is the Nerlove-Arrow
model in Section 7.1.1.

(c) Model Type (c) has linear functions in the state equation and
quadratic functions in the objective function. Therefore, it is sometimes
called the linear-quadratic case. In this case, the optimal control can be
expressed in a form in which the state variable enters linearly. Such a
form is known as the linear decision rule (Table 3.3). A specific example
of this case occurs in the production-inventory example of Section 6.1.1.

(d) Model Type (d) is a more general version of Model Type (b) in 
which the state equation is nonlinear in x. The wheat trading model of 
Section 6.2.1 illustrates this model type. 

(e,f) In Model Types (e) and (f), the functions are scalar functions,
and there is only one state equation so that $\lambda$ is also a scalar function. In
these cases, the Hamiltonian function is nonlinear in $u$. If it is concave
in $u$, then the optimal control is usually obtained by setting $H_u = 0$. If
it is convex, then the optimal control is as in Model Type (b).

Several examples of Model Type (e) occur in this book: the optimal
financing model in Section 5.2.1; the extension of the Nerlove-Arrow
model in Section 7.1.3; the Vidale-Wolfe advertising model in Section
7.2.1; the nonlinear extension of the maintenance and replacement model
in Section 9.1.4; the forestry model in Section 10.2.1; the exhaustible
resource model in Section 10.3.1; and all the models of Chapter 11.
Model Type (f) examples are: the Kamien-Schwartz model in Section
9.2.1 and the sole-owner fishery resource model in Section 10.1.

Although the general forms of the model are specified in Tables 3.1 
and 3.3, there are a number of additional modeling “tricks” that are 
useful, and will be employed later. We collect these as a series of remarks 
below. 

Remark 3.3 We sometimes need to use the absolute value function $|u|$
of a control variable $u$ in forming the functions $\phi$ or $f$. For example, in
the simple cash balance model of Chapter 5, $u < 0$ represents buying
and $u > 0$ represents selling; in either case there is a transaction cost
which can be represented as $c|u|$. In order to handle this we write the
following equations:

$$u := u^+ - u^-, \quad u^+ \ge 0, \quad u^- \ge 0, \qquad (3.78)$$

$$u^+ u^- = 0. \qquad (3.79)$$

Thus, we represent $u$ as the difference of two nonnegative variables, $u^+$
and $u^-$, together with the quadratic constraint (3.79). We can then write

$$|u| = u^+ + u^-, \qquad (3.80)$$

and so have written the nonlinear function $|u|$ as a linear function with
a quadratic constraint (3.79).

We now observe that we need not impose (3.79) explicitly, provided
there is a cost associated with each of the variables $u^+$ and $u^-$, for
because of the transaction cost, no optimal policy would ever choose to
make both of them simultaneously positive. (In the cash management
example, because of transaction costs, we never simultaneously buy and
sell the same security.)







Thus, by doubling the number of variables and adding inequality
constraints, we are able to represent $|u|$ as a linear function in the model.
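
The argument that (3.79) need not be imposed explicitly can be illustrated with a small brute-force search. In the sketch below, every admissible pair with $u^+ - u^- = u$ is written as the minimal pair plus a common nonnegative slack; the helper names, the cost value, and the slack grid are hypothetical.

```python
def split_control(u):
    """Minimal decomposition (3.78): u = u_plus - u_minus with u_plus * u_minus = 0."""
    return max(u, 0.0), max(-u, 0.0)

def cheapest_split(u, cost, slacks):
    """Among pairs (u_plus + s, u_minus + s), s >= 0, minimize cost*(u_plus + u_minus)."""
    base_plus, base_minus = split_control(u)
    best = None
    for s in slacks:
        up, um = base_plus + s, base_minus + s
        total = cost * (up + um)
        if best is None or total < best[0]:
            best = (total, up, um)
    return best

# A purchase of 1.5 units (u < 0) with unit transaction cost 2.
print(cheapest_split(-1.5, 2.0, [0.0, 0.5, 1.0]))
```

Any positive slack raises the transaction cost without changing $u$, so the cheapest representation always has one of the two variables equal to zero, and the cost equals $c|u|$ as in (3.80).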

Remark 3.4 Tables 3.1 and 3.3 are constructed for continuous-time
models. Exactly the same kinds of models can be developed in the
discrete-time case; see Chapter 8.

Remark 3.5 Consider Model Types (a) and (b) when the control variable
constraints are defined by linear inequalities of the form

$$g(u, t) = g(t)u \ge 0. \qquad (3.81)$$

Then, the problem of maximizing the Hamiltonian function becomes:

$$\max\ (D + \lambda B)u$$
$$\text{subject to} \qquad (3.82)$$
$$g(t)u \ge 0.$$

This is clearly a linear programming problem for each given instant of
time $t$, since the Hamiltonian function is linear in $u$.

Further in Model Type (a), the adjoint equation does not contain
terms in $x$ and $u$, so that we can solve for $\lambda(t)$, and hence the objective
function of (3.82) varies parametrically with $\lambda(t)$. In this case we can
use parametric linear programming techniques to solve the problem over
time. Since the optimal solution to the linear program always occurs
at an extreme point of the convex set defined by $g(t)u \ge 0$, it follows
that as $\lambda(t)$ changes, the optimal solution to (3.82) will “bang” from one
extreme point of the feasible set to another. This is called a generalized
bang-bang optimal policy. Specific examples of this kind of policy are
given in the wheat trading models of Chapter 6.

In Model Type (b), the adjoint equation contains terms in $x$, so that
we cannot solve for the trajectory of $\lambda(t)$ without knowing the trajectory
of $x(t)$. It is still true that (3.82) is a linear program for any given $t$,
but the parametric linear programming techniques will not usually work.
Instead some type of iterative procedure is needed in general; see Bryson
and Ho (1969).
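
The generalized bang-bang behavior described for Model Type (a) can be sketched for a feasible set whose extreme points are known. The sketch below assumes a bounded feasible set given directly by its vertex list and a scalar adjoint variable, and it simply evaluates the linear objective of (3.82) at each vertex; the data and function names are hypothetical.

```python
def best_vertex(weights, vertices):
    """Return the vertex maximizing the linear objective sum(w_i * u_i)."""
    return max(vertices, key=lambda v: sum(w * ui for w, ui in zip(weights, v)))

def policy_over_time(D, B, lam_path, vertices):
    """For each value of lambda(t), maximize (D + lambda*B)u over the vertices."""
    return [best_vertex([d + lam * b for d, b in zip(D, B)], vertices)
            for lam in lam_path]

# Hypothetical feasible set: a triangle in the (u1, u2) plane.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
D, B = (1.0, 0.0), (0.0, 1.0)          # objective coefficients are (1, lambda)
lam_path = [0.5, 0.8, 1.5, 2.0]        # adjoint values at successive instants
print(policy_over_time(D, B, lam_path, vertices))
```

As $\lambda(t)$ crosses 1 the maximizer jumps from the vertex $(1, 0)$ to the vertex $(0, 1)$, which is the "bang" from one extreme point to another. For larger problems one would use a linear programming solver in place of vertex enumeration.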

Remark 3.6 The salvage value part of the objective function,
$S[x(T), T]$, makes sense in two cases:

(a) When $T$ is free, and part of the problem is to determine the
optimal terminal time (see, e.g., Section 9.1).

(b) When $T$ is fixed and we want to maximize the salvage value
of the ending state $x(T)$, which in this case can be written simply as
$S[x(T)]$.

For the fixed-end-point problem and for the infinite horizon problem,
it does not usually make much sense to define any salvage value function.

Remark 3.7 One important model type that we did not include in Ta- 
ble 3.3 is the impulse control model of Bensoussan and Lions (1975). In 
this model, an infinite control is instantaneously exerted on a state vari- 
able in order to cause a finite jump in its value. This model is particularly 
appropriate for the instantaneous reordering of inventory as required in 
lot-size models; see Bensoussan et al. (1974). Further discussion of im- 
pulse control is given in Section 12.3. 



EXERCISES FOR CHAPTER 3 

3.1 Consider the constraint set

$$\Omega = \{(u_1, u_2) \mid 0 \le u_1 \le x,\ -1 \le u_2 \le u_1\}.$$

Write these in the form shown in (3.3).

3.2 Find the reachable set $X$ defined in Section 3.1, if $x$ and $u$ satisfy

$$\dot{x} = u - 1, \quad x_0 = 5, \quad -1 \le u \le 1,$$

and $T = 3$.

3.3 Check that the solution of Example 3.1 satisfies the sufficiency 
conditions in Theorem 3.1. 

3.4 Rework Example 3.3 with $T = 4$ and the following different terminal
conditions:

(a) $x(4) = 1$.

(b) $x(4) \le 1$.

(c) $x(4)$ unconstrained.

3.5 Derive the transversality condition (3.14) for the case of unspecified
terminal time by first solving the problem stated in the Mayer form
for a fixed time $T$ to obtain the optimal value $J^*(T)$ of the objective
function, and then maximizing $J^*(T)$ with respect to $T$ by using
the condition $dJ^*(T)/dT = 0$; see Hartl and Sethi (1983).

3.6 Optimal Consumption of an Initial Investment Over a Finite Horizon.
Begin with an initial investment of $x_0$. Assets $x(t)$ at time
$t$ earn at the rate of $r$ per dollar per unit time. A part of the
earnings is consumed, while the remainder is invested. Negative
consumption rate and a consumption rate exceeding the earnings
are not allowed. Assets depreciate at the constant rate $\delta$. Assume
$r > \delta + \rho$, where $\rho$ is the discount rate applied on consumption.
Find the optimal consumption rate over a finite horizon $T$ such
that the present value of the consumption stream over the finite
horizon is maximized. Assume that $T$ is sufficiently large.

3.7 Develop the current-value formulation of Section 3.3 for a time-varying
nonnegative discount rate $\rho(t)$, by replacing the factors
$e^{-\rho t}$ and $e^{-\rho T}$ in (3.26), respectively, by

$$a(t) = e^{-\int_0^t \rho(s)\,ds} \quad \text{and} \quad a(T) = e^{-\int_0^T \rho(s)\,ds}.$$

3.8 Starting from (3.14), derive its current-value version (3.40).

3.9 Re-solve Examples 3.2 and 3.6 when $r \ne \rho$, by using the current-value
formulation.

3.10 Show that the current-value Hamiltonian (3.33) is explicitly independent
of time, if $f$ is explicitly independent of time. Show that
$dH/dt = \rho\lambda f$ and contrast this result with that of Exercise 2.11.

3.11 Rework Example 3.3 with the terminal condition (3.49) replaced
by $x(2) \ge 1$.

3.12 Recall Exercise 2.12 of the leaky reservoir in Chapter 2. In this
problem there was no explicit constraint on the total amount of
water available. Suppose we impose the following isoperimetric
constraint on that problem:

where $K > 0$ is the total amount of water which must be used.
Assume also that the reservoir has infinite capacity. Re-solve this
problem for various values of $K$ and the objective functions in parts
(a) and (b) of Exercise 2.12.

3.13 Introduce a terminal value in Example 3.3 as follows:

$$\max\left\{ J = \int_0^2 (-x)\,dt + Bx(2) \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = 1,$$

$$x(2) \ge 0, \ \text{i.e., } Y = [0, \infty) \text{ in Row 3 of Table 3.1},$$

$$-1 \le u \le 1.$$

Note that for $B = 0$, the problem is the same as Example 3.3.
Solve this problem for $B = 1/2, 1, 3/2, 2, 3$. Conclude that for
$B \ge 2$, the solution for the state variable does not change.

3.14 In Example 3.5, determine the optimal control and the cor- 
responding state trajectory starting at the point (—4,6), 
which lies above the switching curve. 

3.15 Carry out the synthesis of the optimal control for Example 3.5 
when the starting point (xo,yo) lies below the switching curve. 

3.16 Use the results of Exercise 3.15 to find the optimal control and the 
corresponding trajectory starting at the point (-1,-1). 

3.17 Find the optimal control, the minimum time, and the correspond- 
ing trajectory for Example 3.5 starting at the point (—2, 2), which 
lies on the switching curve. 

3.18 What is the shortest time in which a passenger can be transported 
in a ballistic missile from Los Angeles to New York? Assume that 
a missile with the ultimate mechanical and thermodynamical properties
is available, but that the passenger imposes the restraint that
the maximum acceleration or deceleration is 100 ft/sec$^2$. The missile
starts from rest in Los Angeles and stops in New York. Assume
that the path is a straight line of length 2400 miles and ignore the 
rotation and curvature of the earth. 
3.19 Solve the following minimum weighted energy and time problem:
subject to

$$\dot{x} = u, \quad x(0) = 5, \quad x(T) = 0,$$

and the control constraint

$$|u| \le 2.$$

[Hint. Use (3.14) to determine $T^*$, the optimal value of $T$.]

3.20 Rework the previous exercise with the new integrand $F =
-(1/2)(u^2 + 16)$ in the objective function.

[Hint. Note that use of (3.14) gives an infeasible $u$. Instead calculate
$J^*(T)$ as defined in Exercise 3.5, and then choose $T$ to maximize
it. In doing so, take care to see that both $x(T) = 0$ and the
control constraint are satisfied.]

3.21 Exercise 3.20 becomes a minimum energy problem if we set $F =
-u^2/2$. Show that the Hamiltonian maximizing condition of the
maximum principle implies $u^* = k$, where $k$ is a constant. Note
further that the application of (3.14) implies that $k = 0$, which
gives $x(t) = 5$ for all $t \ge 0$ so that the terminal condition $x(T) = 0$
cannot be satisfied.

To see that there exists no optimal control in this situation, let
$k < 0$ and compute $J^*$. It is now possible to see that $\lim_{k \to 0} J^* = 0$.
This means that we can make the objective function value as close
to zero as we wish, but not equal to zero. Note that in this case
there are no feasible solutions satisfying the necessary conditions so
we cannot check the sufficiency conditions (see the last paragraph
of Section 2.1.4).

3.22 Show that every feasible control of the problem

$$\max_{T,u}\left\{ J = \int_0^T -u\,dt \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = x_0, \quad x(T) = 0,$$
$$|u| \le q, \ \text{where } q > 0,$$

is an optimal control.

3.23 Let $x_0 > 0$ be the initial velocity of a rocket. Let $u$ be the amount
of acceleration (or deceleration) caused by applying a force which
consumes fuel at the rate $|u|$. We want to bring the rocket to rest
using minimum total amount of fuel. Hence, we have the following
optimal control problem:

$$\max_{T,u}\left\{ J = \int_0^T -|u|\,dt \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = x_0, \quad x(T) = 0,$$
$$-1 \le u \le +1.$$

[Hint. Use (3.78)-(3.80) to deal with $|u|$. Show that for $x_0 > 0$, say
$x_0 = 5$, every feasible control is optimal.]

3.24 Analyze Exercise 3.23 with the state equation

$$\dot{x} = -ax + u,$$

where $a > 0$. Show that no optimal control exists for the problem.

3.25 From the transversality conditions for the general terminal constraints
in Row 5 of Table 3.1, derive the transversality conditions
in Row 1 for the free-end-point case, in Row 2 for the fixed-end-point
case, and in Rows 3 and 4 for the one-sided constraint cases.
Assume $\sigma(x) \equiv 0$, i.e., there is no salvage value, and $X = E^n$ for
simplicity.

3.26* An example, which illustrates that

$$\lim_{t \to \infty} \lambda(t) = 0$$

is not a necessary transversality condition in general, is:

$$\max\left\{ J = \int_0^\infty (1 - x)u\,dt \right\}$$

such that

$$\dot{x} = (1 - x)u, \quad x(0) = 0,$$

$$0 \le u \le 1.$$

Show this by finding an optimal control.
3.27 Consider the regulator problem defined by the scalar equation

X = a:(0) = xq, 



with the cost 




Show that the optimal feedback control is 



— X >0, 

u*{x) = \ 

X < 0, 

where a feedback control determines the control variable as a func- 
tion of X. 

3.28 Assume the constraint (3.3) to be of the form $g(u, t) \ge 0$, i.e., $g$
does not contain $x$ explicitly, and assume $x(T)$ is free. Apply the
Lagrangian form of the maximum principle and derive the Hamiltonian
form (2.31) with

$$\Omega(t) = \{u \mid g(u, t) \ge 0\}.$$

Assume $g(u, t)$ to be of the form $\alpha \le u \le \beta$.

3.29 Consider the inventory problem:

$$\max\left\{ J = \int_0^\infty -e^{-\rho t}\left[(I - I_1)^2 + (P - P_1)^2\right]dt \right\}$$

subject to

$$\dot{I} = P - S, \quad I(0) = I_0,$$

where $I$ denotes inventory level, $P$ denotes production rate, and $S$
denotes a given constant demand rate.

(a) Find the optimal long-run stationary equilibrium, i.e., the
turnpike defined in (3.73).

(b) Find the Golden Rule by setting $\dot{I} = 0$ in the state equation,
solve for $P$, and substitute it into the integrand of the objective
function. Then, maximize the integrand with respect to $I$.

(c) Verify that the Golden Rule inventory level obtained in (b)
is the same as the turnpike inventory level found in (a) when
$\rho = 0$.

3.30* Use the Lagrangian form of the maximum principle to obtain the
optimal control for the following problem:

$$\max \{ J = x_1(2) \}$$

subject to

$$\dot{x}_1(t) = u_1 - u_2, \quad x_1(0) = 2,$$

$$\dot{x}_2(t) = u_2, \quad x_2(0) = 1,$$

and the constraints

$$x_1(t) \ge 0, \quad x_2(t) \ge 0, \quad 0 \le u_1(t) \le x_2(t), \quad 0 \le u_2(t) \le 2, \quad 0 \le t \le 2.$$

An interpretation of this problem is that $x_1(t)$ is the stock of steel at time $t$
and $x_2(t)$ is the total capacity of the steel mill at time $t$. Production of
steel at rate $u_1$, which is bounded by the current steel mill capacity,
can be split into $u_2$ and $u_1 - u_2$, where $u_2$ goes into increasing the
steel mill capacity and $u_1 - u_2$ adds to the stock of steel. The
objective is to build as large a stockpile of steel as possible by time
$T = 2$. (It is possible to make the problem more interesting by
assuming an exogenous demand $d$ for steel so that $\dot{x}_1 = u_1 - u_2 - d$.)

3.31 Specialize the terminal condition (3.12) in the one-dimensional case
(i.e., $n = 1$) when $Y = [\underline{x}, \bar{x}]$, where $\underline{x}$ and $\bar{x}$ are two constants
satisfying $\bar{x} > \underline{x}$.

3.32 By using the maximum principle, show that the problem

$$\max \int_0^1 x \, dt$$

subject to

$$\dot{x} = x + u, \quad x(0) = 0,$$

$$-1 \le u \le 1,$$

$$x + u \le 2$$

has the optimal control

$$u^*(t) = \begin{cases} 1, & t \in [0, \ln 2], \\ 1 + 2\ln 2 - 2t, & t \in (\ln 2, 1]. \end{cases}$$



3.33 Solve the problem:

$$\max_{u,T} \left\{ J = \int_0^T \left[-2 + (1 - u(t))x(t)\right] dt \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = 0, \quad x(T) \ge 1,$$
$$u \in [0, 1],$$
$$T \in [1, 8].$$

3.34 Consider the problem:

$$\max_{u,T} \left\{ J = \int_0^T \left[-3 - u(t) + x(t)\right] dt \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = 0, \quad x(T) \ge 1,$$
$$u \in [0, 1],$$
$$T \in [1, 4 + 2\sqrt{2}].$$

The problem has two different optimal solutions with different values
for the optimal $T^*$. Find both of these solutions.
The Maximum Principle: General Inequality Constraints



In Chapter 2 we addressed optimal control problems having constraints 
on control variables only. We extended the discussion in Chapter 3 to 
include constraints that may involve state variables in addition to control 
variables. Such constraints were called mixed inequality constraints. 

Often in management science and economics problems there are non- 
negativity constraints on state variables, such as inventory levels or 
wealth. These constraints do not include control variables. Also, there 
may be more general inequality constraints only on state variables, which 
include nonnegative constraints. Such constraints are known as pure 
state variable inequality constraints or, simply, pure state constraints. 

These constraints are more difficult to deal with than the mixed con- 
straints. Pure state variable inequality constraints together with the 
mixed constraints, if any, are to be considered in the present chapter. 
The difficulty with pure state constraints arises from the fact that when 
such a constraint is tight, it does not provide any direct information 
on how to choose values for control variables that will not violate the 
constraint. Indeed, the choice of controls that gives feasible values to the
state variables comes instead from restricting the values of the time
derivatives of the tight state constraints. These derivatives will
contain time derivatives of the state variables, which can be written in terms
of the control and state variables through the use of the state equations.
Thus, the restrictions on the time derivatives of the pure state constraints
are transformed into mixed constraints, and these will be appended
to the Hamiltonian to form the Lagrangian. Because the pure
state constraints are adjoined in this indirect fashion, the corresponding 
Lagrange multipliers must satisfy some complementary slackness condi- 
tions in addition to those mentioned in Chapter 3. For the same reason, 
the above procedure is known as the indirect adjoining method. 

With this formulation of the Lagrangian, we will write the maxi- 
mum principle where the choice of control will come from maximizing 
the Hamiltonian subject to both direct, if they are present, and mixed 
constraints mentioned above. Moreover, the adjoint functions may be 
required to have jumps at those times where the pure state constraints 
become tight. 

Such constraints will be encountered for instance in Section 5.1.3, 
Section 6.2.4, and Section 6.3. Because of the difficulty of the present 
chapter the reader may wish to skim it for now and read it in detail later. 

In Section 4.1, we describe the indirect method for solving such prob- 
lems, which works by adjoining the first derivative of the pure state con- 
straints to the Lagrangian function and imposing some additional con- 
straints on the Lagrange multipliers of the resulting formulation. The 
indirect method may also involve imposing jump conditions on the ad- 
joint variable as described in Section 4.1.1. These ideas are put together 
to give a general maximum principle for the indirect method in Section 
4.2. The current-value form of that maximum principle is described in
Section 4.3. 



4.1 Pure State Variable Inequality Constraints: 
Indirect Method 

In many management science and economics problems, it is common to 
require state variables such as an inventory level or wealth to remain 
nonnegative, i.e., to impose the state constraint of the form 

$$x(t) \ge 0 \quad \text{for } t \in [0,T], \tag{4.1}$$

i.e., $x_i(t) \ge 0$, $i = 1, 2, \ldots, n$. Note that the control variables do not enter
directly into (4.1). Constraints exhibiting this property are called pure
state variable inequality constraints, and they generally can be expressed
in the form

$$h(x,t) \ge 0 \quad \text{for } t \in [0,T]. \tag{4.2}$$

It should be clear that nonnegativity state constraints such as (4.1)
are a special case of (4.2). The reader is referred to Seierstad and
Sydsæter (1987), Feichtinger and Hartl (1986), and Hartl, Sethi, and Vickson
(1995) for a detailed discussion of such constraints.

Mainly there are two different but related approaches to deal with 
pure state inequality constraints. These are called direct and indirect 
adjoining methods. The reason for this terminology will become clear 
later in this section. 

In this book, we have chosen to follow the indirect adjoining method. 
As for the direct adjoining method and its relation to the indirect 
method, the reader is referred to Hartl, Sethi, and Vickson (1995) and 
Feichtinger and Hartl (1986). 

First, we discuss the special case of the nonnegativity constraint (4.1).
The general case involving (4.2) is treated in Section 4.2. At any point
where a component $x_i(t) > 0$, the corresponding constraint $x_i(t) \ge 0$
is not binding and can be ignored. In any interval where $x_i(t) = 0$, we
must have $\dot{x}_i(t) \ge 0$ so that $x_i$ does not become negative. Hence, the
control must be constrained to satisfy $\dot{x}_i = f_i \ge 0$, making $f_i \ge 0$
a constraint of the mixed type (3.3) over the interval. We can add the
constraint

$$f_i(x,u,t) \ge 0, \quad \text{whenever } x_i(t) = 0, \tag{4.3}$$

to the original set (3.3). We associate multipliers $\eta_i$ with (4.3) whenever
(4.3) must be imposed, i.e., whenever $x_i(t) = 0$. A convenient way to
do this is to impose an "either or" condition $\eta_i x_i = 0$. This will make
$\eta_i = 0$ whenever $x_i > 0$. We can now form the Lagrangian

$$L = H + \mu g + \eta f, \tag{4.4}$$

where the Hamiltonian $H$ is as defined in (3.7) and $\eta = (\eta_1, \eta_2, \ldots, \eta_n)$,
and apply the maximum principle in (3.11) with the additional necessary
conditions satisfied by the multiplier $\eta$, namely,

$$\eta \ge 0, \quad \eta x^* = 0, \quad \dot{\eta} \le 0, \tag{4.5}$$

(see Remark 4.1) and the modified transversality condition

$$\lambda(T) = S_x(x^*(T),T) + \alpha a_x(x^*(T),T) + \beta b_x(x^*(T),T) + \gamma \tag{4.6}$$

on the adjoint variable $\lambda$, where $\gamma$ is a constant vector satisfying

$$\gamma \ge 0, \quad \gamma x^*(T) = 0. \tag{4.7}$$
Since the constraints are adjoined indirectly (in this case via their
first time derivative) to form the Lagrangian, the method is called the
indirect adjoining method. If, on the other hand, the Lagrangian $L^d$ is
formed by adjoining the constraints (4.1) directly, i.e.,

$$L^d = H + \mu g + \nu x,$$

where $\nu$ is a multiplier associated with (4.1), then the method is referred
to as the direct adjoining method. It should be noted that $\lambda$, $\mu$, and $\nu$
of the direct adjoining method are, in general, different from $\lambda$, $\mu$, and
$\eta$ of the indirect adjoining method. However, as will be seen in Section
4.4, there is a relationship that holds between the two sets of multipliers.
The reader is referred to Hartl, Sethi, and Vickson (1995) for details.

Remark 4.1 The first two conditions in (4.5) are complementary slackness
conditions on the multiplier $\eta$. The last condition $\dot{\eta} \le 0$ is difficult
to motivate. We know, however, that the direct maximum principle
multiplier $\nu$ is related to $\eta$ as $\nu = -\dot{\eta}$; see Hartl, Sethi, and Vickson (1995).
The complementary slackness conditions for the direct multiplier $\nu$ are
$\nu \ge 0$ and $\nu x^* = 0$. Since $\nu \ge 0$, it follows that $\dot{\eta} \le 0$.

Example 4.1 Consider the problem:

$$\max \left\{ J = \int_0^2 (-x) \, dt \right\}$$

subject to

$$\dot{x} = u, \quad x(0) = 1, \tag{4.8}$$

$$u + 1 \ge 0, \quad 1 - u \ge 0, \tag{4.9}$$

$$x \ge 0. \tag{4.10}$$

Note that (4.9) is a restatement of $-1 \le u \le 1$. Note further that
this problem is the same as Example 2.2, except for the nonnegativity
constraint (4.10).

Solution. The Hamiltonian is

$$H = -x + \lambda u,$$

which implies the optimal control to be

$$u^* = \text{bang}[-1, 1; \lambda], \quad \text{whenever } x > 0. \tag{4.11}$$
When $x = 0$, we impose $\dot{x} = u \ge 0$ in order to ensure that (4.10) holds.
Therefore, the optimal control on the state constraint boundary is

$$u^* = \text{bang}[0, 1; \lambda], \quad \text{whenever } x = 0. \tag{4.12}$$

Now we form the Lagrangian

$$L = H + \mu_1(u + 1) + \mu_2(1 - u) + \eta u,$$

where $\mu_1$, $\mu_2$, and $\eta$ satisfy the complementary slackness conditions

$$\mu_1 \ge 0, \quad \mu_1(u + 1) = 0, \tag{4.13}$$

$$\mu_2 \ge 0, \quad \mu_2(1 - u) = 0, \tag{4.14}$$

$$\eta \ge 0, \quad \eta x = 0, \quad \dot{\eta} \le 0. \tag{4.15}$$

Furthermore, the optimal trajectory must satisfy

$$\frac{\partial L}{\partial u} = \lambda + \mu_1 - \mu_2 + \eta = 0. \tag{4.16}$$

From the Lagrangian we also get

$$\dot{\lambda} = -\frac{\partial L}{\partial x} = 1, \quad \lambda(2) = \gamma \ge 0, \quad \gamma x(2) = \lambda(2)x(2) = 0. \tag{4.17}$$

Let us first try $\lambda(2) = \gamma = 0$. Then, the solution for $\lambda$ is the same as in
Example 2.2, namely,

$$\lambda(t) = t - 2. \tag{4.18}$$

Since $\lambda(t) \le -1$ on $[0,1]$ and $x(0) = 1 > 0$, the initial optimal control
given by (4.11) is $u^*(t) = -1$. Substituting this into (4.8) we get $x(t) =
1 - t$, which is positive for $t < 1$. Thus,

$$u^*(t) = -1 \quad \text{for } 0 \le t < 1.$$

In the time interval $[0,1)$, by (4.14), $\mu_2 = 0$ since $u^* < 1$, and by (4.15)
$\eta = 0$ because $x > 0$. Therefore, $\mu_1(t) = -\lambda(t) = 2 - t > 0$ for $0 \le t < 1$,
and this with $u = -1$ satisfies (4.13).

At $t = 1$ we have $x(1) = 0$, so the optimal control is given by (4.12),
which is $u^*(1) = 0$. Now assume that we continue to use the control
$u^*(t) = 0$ in the interval $1 \le t \le 2$. With this control we can solve
(4.8) beginning with $x(1) = 0$, and obtain $x(t) = 0$ for $1 \le t \le 2$. Since
$\lambda(t) \le 0$ in the same interval, we see that $u^*(t) = 0$ satisfies (4.12)
throughout this interval.
To complete the solution, we calculate the Lagrange multipliers.
Since $u^*(t) = 0$ on $t \in [1,2]$, we have $\mu_1(t) = \mu_2(t) = 0$ throughout
the interval $[1,2]$. Then, from (4.16) we obtain $\eta(t) = -\lambda(t) = 2 - t \ge 0$,
which, with $x(t) = 0$, satisfies (4.15) on $t \in [1,2]$. This completes the
solution. The graphs of $x(t)$ and $\lambda(t)$ are shown in Figure 4.1.

Figure 4.1: State and Adjoint Trajectories in Example 4.1
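The solution of Example 4.1 can be checked numerically. The following is a minimal sketch, not part of the original text: it tabulates the control, state, adjoint, and multipliers found above on a grid over $[0,2]$ and asserts conditions (4.13)-(4.17). The function names (`u_star`, `x_star`, `lam`, `mu1`, `mu2`, `eta`, `check`) and the grid size are illustrative choices.

```python
# Numerical check of the multipliers found in Example 4.1 (T = 2).
# All names here are illustrative helpers, not notation from the text.

def u_star(t):      # optimal control: -1 on [0,1), 0 on [1,2]
    return -1.0 if t < 1.0 else 0.0

def x_star(t):      # state from x' = u, x(0) = 1
    return 1.0 - t if t < 1.0 else 0.0

def lam(t):         # adjoint (4.18)
    return t - 2.0

def mu1(t):         # multiplier of u + 1 >= 0
    return 2.0 - t if t < 1.0 else 0.0

def mu2(t):         # multiplier of 1 - u >= 0
    return 0.0

def eta(t):         # multiplier of the adjoined derivative x' = u >= 0
    return 0.0 if t < 1.0 else 2.0 - t

def check(n=2001, tol=1e-12):
    ts = [2.0 * k / (n - 1) for k in range(n)]
    for t in ts:
        u, x = u_star(t), x_star(t)
        # stationarity (4.16): dL/du = lam + mu1 - mu2 + eta = 0
        assert abs(lam(t) + mu1(t) - mu2(t) + eta(t)) < tol
        # complementary slackness (4.13)-(4.15)
        assert mu1(t) >= 0 and abs(mu1(t) * (u + 1.0)) < tol
        assert mu2(t) >= 0 and abs(mu2(t) * (1.0 - u)) < tol
        assert eta(t) >= 0 and abs(eta(t) * x) < tol
        assert x >= -tol                      # state constraint (4.10)
    # eta must be nonincreasing on the boundary interval [1, 2]
    assert all(eta(b) <= eta(a) + tol
               for a, b in zip(ts, ts[1:]) if a >= 1.0)
    # transversality (4.17) with gamma = 0
    return abs(lam(2.0)) < tol and abs(lam(2.0) * x_star(2.0)) < tol
```

Calling `check()` runs all assertions and returns the result of the transversality test. The monotonicity of $\eta$ is checked only on the boundary interval, since $\eta$ is zero on the interior interval $[0,1)$.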
It should be obvious that if the terminal time were $T = 1.5$, the
optimal control would be

$$u^*(t) = \begin{cases} -1, & t \in [0,1), \\ 0, & t \in [1, 1.5]. \end{cases}$$

You are asked in Exercise 4.8 to redo the above calculation in this case
and show that one now needs to have $\gamma = 1/2$.

In Exercise 4.2, you are asked to solve a similar problem with $F = -u$.

Remark 4.2 Example 4.1 is a problem instance in which the state constraint
is active at the terminal time. In instances where the initial state
or the final state or both are on the constraint boundary, the maximum
principle may degenerate in the sense that there is no nontrivial solution
of the necessary conditions, i.e., $\lambda(t) \equiv 0$, $t \in [0,T]$, where $T$ is the
terminal time. See Arutyunov and Aseev (1997) or Ferreira and Vinter (1994)
for conditions that guarantee a nontrivial solution for the multipliers.

Remark 4.3 As will be seen in Exercise 4.13, Example 4.1 is a problem
instance in which the multipliers $\lambda$ and $\mu_1$ are not unique. The phenomenon
is related to the fact that we are able to obtain a continuous $\lambda$ at the
time $t = 1$ when the state variable enters the boundary of the constraint
$x \ge 0$. In general, a jump in $\lambda$ may be required at the entry time, as
discussed next in Section 4.1.1. For references dealing with the issue
of nonuniqueness of the multipliers and with conditions under which
the multipliers are unique, see Kurcyusz and Zowe (1979) and Shapiro
(1997).

4.1.1 Jump Conditions

In Example 4.1, we were able to obtain a continuous adjoint function
$\lambda(t)$; see Exercise 4.13. This may not always be possible in the presence
of constraints (4.1). In the next example, we shall see that we need a
piecewise continuous $\lambda(t)$ satisfying a jump condition

$$\lambda(\tau^-) = \lambda(\tau^+) + \zeta(\tau), \quad \zeta(\tau) \ge 0 \tag{4.19}$$

at a time $\tau$ at which the state trajectory hits its boundary value zero.
Example 4.2 Consider Example 4.1 with $T = 3$ and the terminal state
constraint

$$x(3) = 1. \tag{4.20}$$

Clearly, the optimal control $u^*$ will be the one that keeps $x$ as small as
possible, subject to the state constraint (4.1) and the boundary condition
$x(0) = x(3) = 1$. Thus,

$$u^*(t) = \begin{cases} -1, & t \in [0,1), \\ 0, & t \in [1,2], \\ 1, & t \in (2,3]. \end{cases} \tag{4.21}$$

For brevity, we will not derive the optimal policy from the various
necessary optimality conditions as we did in Example 4.1. Rather, we shall
only compute the adjoint function and the multipliers that satisfy the
optimality conditions. These are

$$\lambda(t) = \begin{cases} t - 1, & t \in [0,1], \\ t - 2, & t \in (1,3], \end{cases} \tag{4.22}$$

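$$\mu_1(t) = \begin{cases} 1 - t, & t \in [0,1], \\ 0, & t \in (1,3], \end{cases} \qquad \mu_2(t) = \begin{cases} 0, & t \in [0,2], \\ t - 2, & t \in (2,3], \end{cases} \tag{4.23}$$

$$\eta(t) = \begin{cases} 0, & t \in [0,1), \\ 2 - t, & t \in [1,2], \\ 0, & t \in (2,3], \end{cases} \tag{4.24}$$

and a jump $\zeta(1) = 1 \ge 0$ so that

$$\lambda(1^-) = \lambda(1^+) + \zeta(1). \tag{4.25}$$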
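The jump in Example 4.2 can be reproduced with a short computation. The sketch below is an illustration under stated assumptions, not the book's procedure: it takes the control (4.21) and the adjoint (4.22) as given, recovers the multipliers $\mu_1$, $\mu_2$, $\eta$ on each interval from the stationarity condition $\lambda + \mu_1 - \mu_2 + \eta = 0$ together with complementary slackness, and then evaluates the jump $\zeta(1) = \lambda(1^-) - \lambda(1^+)$. All helper names are hypothetical, and the grid uses interval midpoints so that the junction times $t = 1$ and $t = 2$ are never sampled.

```python
# Sketch for Example 4.2 (T = 3, x(3) = 1); helper names are illustrative.

def u_star(t):                       # control (4.21)
    if t < 1.0:
        return -1.0
    return 0.0 if t <= 2.0 else 1.0

def x_star(t):                       # closed-form solution of x' = u, x(0) = 1
    if t < 1.0:
        return 1.0 - t
    return 0.0 if t <= 2.0 else t - 2.0

def lam_left(t):                     # branch of (4.22) valid on [0, 1]
    return t - 1.0

def lam_right(t):                    # branch of (4.22) valid on (1, 3]
    return t - 2.0

def lam(t):
    return lam_left(t) if t <= 1.0 else lam_right(t)

def multipliers(t):
    """Return (mu1, mu2, eta) solving lam + mu1 - mu2 + eta = 0."""
    u, x, l = u_star(t), x_star(t), lam(t)
    if x > 0.0 and u == -1.0:        # only u + 1 >= 0 is active
        return (-l, 0.0, 0.0)
    if x > 0.0 and u == 1.0:         # only 1 - u >= 0 is active
        return (0.0, l, 0.0)
    return (0.0, 0.0, -l)            # boundary x = 0 with interior control u = 0

def check(n=3000, tol=1e-12):
    ts = [(k + 0.5) * 3.0 / n for k in range(n)]   # midpoints avoid t = 1, 2
    prev_eta = None
    for t in ts:
        mu1, mu2, eta = multipliers(t)
        assert min(mu1, mu2, eta) >= -tol          # sign conditions
        assert abs(lam(t) + mu1 - mu2 + eta) < tol # stationarity
        assert x_star(t) >= 0.0                    # state constraint
        if 1.0 < t < 2.0:                          # eta nonincreasing on boundary
            assert prev_eta is None or eta <= prev_eta + tol
            prev_eta = eta
    return lam_left(1.0) - lam_right(1.0)          # jump zeta(1), cf. (4.25)
```

Here `check()` returns the size of the jump at the entry time, which should be nonnegative for (4.19) to hold.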



4.2 A Maximum Principle: Indirect Method 

In this section, we state a maximum principle for problems involving
mixed inequality constraints (3.3) and the pure state variable inequality
constraints

$$h(x,t) \ge 0, \tag{4.26}$$

where we assume the function $h: E^n \times E^1 \to E^p$ to be as many times
continuously differentiable as required. By the definition of the function $h$,
(4.26) represents a set of $p$ constraints $h_i(x,t) \ge 0$, $i = 1, 2, \ldots, p$.
The constraint $h_i \ge 0$ is called a constraint of $r$th order if the
$r$th time derivative of $h_i$ is the first one in which a term in the control $u$
appears when $f(x,u,t)$ is substituted for $\dot{x}$ after each differentiation. It
is through this expression that the control acts to satisfy the constraint
$h_i \ge 0$. The value of $r$ is referred to as the order of the constraint.

In this book we shall consider only first-order constraints, i.e., $r = 1$.
A method for adjoining higher-order constraints to form the Lagrangian
is given in Bryson and Ho (1969) and Hartl, Sethi, and Vickson (1995).
See also Exercise 4.10.

In the case of first-order constraints, we need to define $h^1(x,u,t)$ as
follows:

$$h^1 = \frac{dh}{dt} = \frac{\partial h}{\partial x} f + \frac{\partial h}{\partial t}. \tag{4.27}$$
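As a quick illustration of (4.27), consider the scalar system $\dot{x} = u$ with the constraint (4.34) that appears in Example 4.3 later in this section. The second system below is a hypothetical one, added here only to contrast a first-order constraint with a second-order one.

$$
\begin{aligned}
&\text{First order: } f = u,\quad h(x,t) = x - 1 + (t-2)^2: \\
&\qquad h^1 = \frac{\partial h}{\partial x} f + \frac{\partial h}{\partial t} = 1 \cdot u + 2(t-2) = u + 2(t-2). \\[4pt]
&\text{Second order (hypothetical): } \dot{x}_1 = x_2,\ \dot{x}_2 = u,\quad h(x,t) = x_1: \\
&\qquad \frac{dh}{dt} = x_2 \quad (\text{no } u \text{ appears}), \qquad \frac{d^2 h}{dt^2} = \dot{x}_2 = u.
\end{aligned}
$$

In the first case the control shows up after one differentiation, so $r = 1$ and $h^1 \ge 0$ is a mixed constraint that can be imposed whenever $h = 0$. In the second case two differentiations are needed, so $r = 2$ and the method of this chapter does not apply directly.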



As in Chapter 3, the constraints (4.26) also need to satisfy a full-rank
type constraint qualification before a maximum principle can be
derived. With respect to the $i$th constraint $h_i(x,t) \ge 0$, an interval
$(\theta_1, \theta_2) \subset [0,T]$ with $\theta_1 < \theta_2$ is called an interior interval if $h_i(x(t),t) > 0$
for all $t \in (\theta_1, \theta_2)$. If the optimal trajectory "hits the boundary," i.e.,
satisfies $h_i(x(t),t) = 0$ for $\tau_1 \le t \le \tau_2$ for some $i$, then $[\tau_1, \tau_2]$ is called
a boundary interval. An instant $\tau_1$ is called an entry time if there is
an interior interval ending at $t = \tau_1$ and a boundary interval starting
at $\tau_1$. Correspondingly, $\tau_2$ is called an exit time if a boundary interval
ends and an interior interval starts at $\tau_2$. If the trajectory just touches
the boundary at time $\tau$, i.e., $h(x(\tau),\tau) = 0$, and if the trajectory is in
the interior just before and just after $\tau$, then $\tau$ is called a contact time.
Taken together, entry, exit, and contact times are called junction times.
Throughout the book, we shall assume that the constraint qualification
introduced in Section 3.1 as well as the following full-rank condition on
any boundary interval $[\tau_1, \tau_2]$ hold:

$$\operatorname{rank} \begin{bmatrix} \partial h_1^1/\partial u \\ \partial h_2^1/\partial u \\ \vdots \\ \partial h_{\hat{p}}^1/\partial u \end{bmatrix} = \hat{p},$$

where for $t \in [\tau_1, \tau_2]$,

$$h_i(x^*(t),t) = 0, \quad i = 1, 2, \ldots, \hat{p} \le p,$$

and

$$h_i(x^*(t),t) > 0, \quad i = \hat{p}+1, \ldots, p.$$

Note that this full-rank condition on the constraints (4.26) is written
for the case in which the order of each of the constraints in (4.26) is one.
For the general case of higher-order constraints, see Feichtinger and Hartl (1986) and
Hartl, Sethi, and Vickson (1995).

To formulate the maximum principle for the problem defined by the
state equation (3.1), the objective function (3.2), mixed inequality
constraints (3.3), terminal state constraints (3.4) and (3.5), and first-order
pure state variable inequality constraints (4.26), we form the Lagrangian
as

$$L(x, u, \lambda, \mu, \eta, t) = H(x, u, \lambda, t) + \mu g(x, u, t) + \eta h^1(x, u, t), \tag{4.28}$$

where the Hamiltonian

$$H = F(x, u, t) + \lambda f(x, u, t)$$

is as defined in (3.7), $\mu$ satisfies the complementary slackness conditions

$$\mu \ge 0, \quad \mu g(x, u, t) = 0$$

as stated in (3.9), and $\eta \in E^p$ (a row vector) satisfies the conditions

$$\eta \ge 0, \quad \eta h(x,t) = 0, \quad \dot{\eta} \le 0.$$

Note that these conditions on the multiplier $\eta$ are a generalization of
(4.5), with the pure state constraints of the type (4.26) replacing the
nonnegativity constraints (4.1).

For the problem under consideration in this section, we shall now
state the maximum principle, which includes the discussion above and
the required jump conditions. For details, see Pontryagin et al. (1962),
Feichtinger and Hartl (1986, p. 170), Clarke and Loewen (1987), Hartl,
Sethi, and Vickson (1995), and references therein.

The maximum principle states that the necessary conditions for $u^*$
(with the state trajectory $x^*$) to be an optimal control for the problem
defined above are that there exist an adjoint variable $\lambda$, multipliers
$\mu$, $\alpha$, $\beta$, $\gamma$, $\eta$, and the jump parameter $\zeta$, which satisfy (4.29) that follows:
$\dot{x}^* = f(x^*, u^*, t)$, $x^*(0) = x_0$,
satisfying constraints
$g(x^*, u^*, t) \ge 0$, $h(x^*, t) \ge 0$, and
the terminal constraints
$a(x^*(T), T) \ge 0$ and $b(x^*(T), T) = 0$;

$\dot{\lambda} = -L_x[x^*, u^*, \lambda, \mu, \eta, t]$
with the transversality conditions
$\lambda(T^-) = S_x(x^*(T),T) + \alpha a_x(x^*(T),T) + \beta b_x(x^*(T),T) + \gamma h_x(x^*(T),T)$, and
$\alpha \ge 0$, $\alpha a(x^*(T),T) = 0$, $\gamma \ge 0$, $\gamma h(x^*(T),T) = 0$;

the Hamiltonian maximizing condition
$H[x^*(t), u^*(t), \lambda(t), t] \ge H[x^*(t), u, \lambda(t), t]$
at each $t \in [0,T]$ for all $u$ satisfying
$g[x^*(t), u, t] \ge 0$, and
$h_i^1(x^*(t), u, t) \ge 0$ whenever $h_i(x^*(t), t) = 0$, $i = 1, 2, \ldots, p$;

the jump conditions at any entry/contact time $\tau$ are
$\lambda(\tau^-) = \lambda(\tau^+) + \zeta(\tau) h_x(x^*(\tau), \tau)$ and
$H[x^*(\tau), u^*(\tau^-), \lambda(\tau^-), \tau] = H[x^*(\tau), u^*(\tau^+), \lambda(\tau^+), \tau] - \zeta(\tau) h_t(x^*(\tau), \tau)$;

the Lagrange multipliers $\mu(t)$ are such that
$\partial L/\partial u|_{u = u^*(t)} = 0$, $dH/dt = dL/dt = \partial L/\partial t$,
and the complementary slackness conditions
$\mu(t) \ge 0$, $\mu(t) g(x^*, u^*, t) = 0$,
$\eta(t) \ge 0$, $\dot{\eta}(t) \le 0$, $\eta(t) h(x^*(t), t) = 0$, and
$\zeta(\tau) \ge 0$, $\zeta(\tau) h(x^*(\tau), \tau) = 0$ hold.

(4.29)
Note that the jump conditions on the adjoint variables in (4.29)
generalize the jump condition (4.25) encountered in Example 4.2. The jump
condition on $H$ in (4.29) requires that the Hamiltonian should be
continuous at $\tau$ if $h_t = 0$. The continuity of the Hamiltonian (in case $h_t = 0$)
makes intuitive sense when considered in the light of its interpretation
given in Section 2.2.4. Example 5.1 is an instance where the jump
conditions apply; see also Section 6.2.4.

Hartl, Sethi, and Vickson (1995) and Feichtinger and Hartl (1986)
also add the following condition

$$\zeta(\tau_1) \ge \eta(\tau_1^+) \tag{4.30}$$

at each entry time $\tau_1$ to the maximum principle necessary conditions
(4.29). This condition could also be added to the current-value maximum
principle (4.42) stated later in Section 4.3.

Remark 4.4 Condition (4.30) is equivalent to the nonnegativity of the 
jump multiplier arising in the direct maximum principle. For further 
discussion on this condition, see McIntyre and Paiewonsky (1967),
Jacobson, Lele, and Speyer (1971), Taylor (1972), and Kreindler (1982).

This brief discussion of the jump conditions, limited here only to 
first-order pure state constraints, is far from complete, and a detailed 
discussion is beyond the scope of this book. An interested reader should 
consult the comprehensive survey by Hartl, Sethi, and Vickson (1995). 
For an example with a second-order state constraint, see Maurer (1977). 

Needless to say, computational methods are required to solve problems
with general inequality constraints in all but the simplest
cases. The reader should consult the excellent book by Teo, Goh, and
Wong (1991) and references therein for computational procedures and 
software. See also Polak, Yang, and Mayne (1993), Bulirsch and Kraft 
(1994), Bryson (1998), and Pytlak and Vinter (1993, 1999). 

Example 4.3 Consider the following problem with the discount rate
$\rho \ge 0$:

$$\min \left\{ J = \int_0^3 e^{-\rho t} u \, dt \right\} \tag{4.31}$$

subject to

$$\dot{x} = u, \quad x(0) = 0, \tag{4.32}$$

$$0 \le u \le 3, \tag{4.33}$$

$$x - 1 + (t - 2)^2 \ge 0. \tag{4.34}$$

Solution. From the objective function (4.31), one can see that it is
good to have low values of $u$. If we use $u = 0$ to begin with, we see that
$x(t) = 0$ as long as $u(t) = 0$. But continuing with $u(t) = 0$ beyond $t = 1$
is not feasible since $x(t) = 0$ would not satisfy the constraint (4.34) just
after $t = 1$.

At $t = 1$, the constraint (4.34) is satisfied with an equality; see Figure
4.2. In order not to violate the constraint, its first time derivative $u - 2(2 - t)$
must be nonnegative. This gives us $u(t) = 2(2 - t)$ as the lowest
feasible value for the control. This value of the control will make the state
$x(t)$ ride on the constraint boundary until $t = 2$, at which point $u(2) = 0$;
see Figure 4.2. Continuing with $u(t) = 2(2 - t)$ beyond $t = 2$ would make
$u(t)$ negative and violate the lower bound in (4.33).

Figure 4.2: Infeasible State Space and Optimal State Trajectory
for Example 4.3

It is easy to see, however, that $u(t) = 0$, $t \ge 2$, is the lowest feasible
value, which can be followed all the way to the terminal time $t = 3$.

We can now restate the values of the state and the control variables
that we have obtained:
$$x^*(t) = \begin{cases} 0, & t \in [0,1), \\ 1 - (t-2)^2, & t \in [1,2], \\ 1, & t \in (2,3], \end{cases} \qquad u^*(t) = \begin{cases} 0, & t \in [0,1), \\ 2(2-t), & t \in [1,2], \\ 0, & t \in (2,3]. \end{cases}$$

Note that at the entry time $t = 1$ to the state constraint (4.34), the
control $u^*$ and, therefore, $h^{1*} = u^* + 2(t - 2)$ is discontinuous, i.e., the
entry is non-tangential. On the other hand, $u^*$ and $h^{1*}$ are continuous
at $t = 2$ so that the exit is tangential.

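With $u^*$ and $x^*$ thus obtained, we must obtain $\lambda$, $\mu_1$, $\mu_2$, $\eta$, and $\zeta$
so that the necessary optimality conditions (4.29) hold, i.e.,

$$H = -e^{-\rho t} u + \lambda u, \tag{4.35}$$

$$L = H + \mu_1 u + \mu_2 (3 - u) + \eta [u + 2(t - 2)], \tag{4.36}$$

$$L_u = -e^{-\rho t} + \lambda + \mu_1 - \mu_2 + \eta = 0, \tag{4.37}$$

$$\dot{\lambda} = -L_x = 0, \quad \lambda(3) = 0, \tag{4.38}$$

$$\mu_1 \ge 0, \quad \mu_1 u = 0, \quad \mu_2 \ge 0, \quad \mu_2 (3 - u) = 0, \tag{4.39}$$

$$\eta \ge 0, \quad \dot{\eta} \le 0, \quad \eta [x - 1 + (t - 2)^2] = 0, \tag{4.40}$$

$$\lambda(1^-) = \lambda(1^+) + \zeta(1), \quad \zeta(1) \ge 0. \tag{4.41}$$

By trial and error, we can easily obtain

$$\lambda(t) = \begin{cases} e^{-\rho}, & 0 \le t < 1, \\ 0, & 1 \le t \le 3, \end{cases}$$

as shown in Figure 4.3,

$$\mu_1(t) = \begin{cases} e^{-\rho t} - e^{-\rho}, & 0 \le t < 1, \\ 0, & 1 \le t \le 2, \\ e^{-\rho t}, & 2 < t \le 3, \end{cases} \qquad \mu_2(t) = 0, \quad 0 \le t \le 3,$$

Figure 4.3: Adjoint Trajectory for Example 4.3

and

$$\eta(t) = \begin{cases} 0, & 0 \le t < 1, \\ e^{-\rho t}, & 1 \le t \le 2, \\ 0, & 2 < t \le 3, \end{cases}$$

which, along with $u^*$ and $x^*$, satisfy (4.29).

Note, furthermore, that $\lambda$ is continuous at the exit time $t = 2$. At
the entry time $\tau_1 = 1$, $\zeta(1) = e^{-\rho} \ge \eta(1^+) = e^{-\rho}$, so that (4.30) also
holds.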
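The conditions (4.37)-(4.41) for Example 4.3 can also be verified on a grid. The sketch below is illustrative and makes several assumptions that are not in the text: the discount rate is fixed at the arbitrary value `RHO = 0.1`, the multipliers are taken in the piecewise form implied by (4.37)-(4.41), namely $\lambda = e^{-\rho}$ on $[0,1)$ and zero afterwards, $\mu_2 \equiv 0$, and $\eta = e^{-\rho t}$ on the boundary interval $[1,2]$, and all function names are hypothetical.

```python
import math

RHO = 0.1  # illustrative discount rate (an assumption; the text keeps rho symbolic)

def u_star(t):                 # optimal control of Example 4.3
    return 2.0 * (2.0 - t) if 1.0 <= t <= 2.0 else 0.0

def x_star(t):                 # optimal state of Example 4.3
    if t < 1.0:
        return 0.0
    return 1.0 - (t - 2.0) ** 2 if t <= 2.0 else 1.0

def h(t):                      # left-hand side of (4.34) along x*
    return x_star(t) - 1.0 + (t - 2.0) ** 2

def lam(t):                    # adjoint; drops to 0 at the entry time t = 1
    return math.exp(-RHO) if t < 1.0 else 0.0

def mu1(t):                    # multiplier of u >= 0
    if t < 1.0:
        return math.exp(-RHO * t) - math.exp(-RHO)
    return 0.0 if t <= 2.0 else math.exp(-RHO * t)

def eta(t):                    # multiplier of h^1 = u + 2(t - 2) >= 0
    return math.exp(-RHO * t) if 1.0 <= t <= 2.0 else 0.0

def check(n=3001, tol=1e-12):
    ts = [3.0 * k / (n - 1) for k in range(n)]
    for t in ts:
        u = u_star(t)
        assert -tol <= u <= 3.0 + tol and h(t) >= -tol      # feasibility
        # (4.37) with mu2 = 0: -e^{-rho t} + lam + mu1 + eta = 0
        assert abs(-math.exp(-RHO * t) + lam(t) + mu1(t) + eta(t)) < tol
        assert mu1(t) >= -tol and abs(mu1(t) * u) < tol     # (4.39)
        assert eta(t) >= 0.0 and abs(eta(t) * h(t)) < tol   # (4.40)
    on_boundary = [t for t in ts if 1.0 <= t <= 2.0]
    assert all(eta(b) <= eta(a) + tol
               for a, b in zip(on_boundary, on_boundary[1:]))
    zeta = lam(0.5) - lam(1.0)      # jump lam(1-) - lam(1+), cf. (4.41)
    assert zeta >= eta(1.0) - tol   # condition (4.30) at the entry time
    return zeta
```

`check()` returns the jump $\zeta(1)$ for the chosen `RHO`. Changing `RHO` to another nonnegative value should leave all assertions satisfied, whereas a negative value breaks the monotonicity of $\eta$, which is the situation Exercise 4.11 asks about.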



4.3 Current-Value Maximum Principle:
Indirect Method

Just as the necessary condition (3.41) represents the current-value
formulation corresponding to (3.11), we can, when first-order state constraints
are present, also state the current-value formulation of the necessary
condition (4.29).

With the Hamiltonian $H$ as defined in (3.33), we can write the
Lagrangian

$$L[x, u, \lambda, \mu, \eta, t] = H + \mu g + \eta h^1 = \phi + \lambda f + \mu g + \eta h^1.$$

We can now state the current-value form of the maximum principle,
which states that the necessary conditions for $u^*$ (with the state
trajectory $x^*$) to be an optimal control are that there exist $\lambda$, $\mu$, $\alpha$, $\beta$, $\gamma$, $\eta$, and
$\zeta$, which satisfy (4.42) that follows:
$\dot{x}^* = f(x^*, u^*, t)$, $x^*(0) = x_0$,
satisfying constraints
$g(x^*, u^*, t) \ge 0$, $h(x^*(t), t) \ge 0$,
and the terminal constraints
$a(x^*(T), T) \ge 0$ and $b(x^*(T), T) = 0$;

$\dot{\lambda} = \rho\lambda - L_x[x^*, u^*, \lambda, \mu, \eta, t]$
with the transversality conditions
$\lambda(T^-) = \sigma_x(x^*(T),T) + \alpha a_x(x^*(T),T) + \beta b_x(x^*(T),T) + \gamma h_x(x^*(T),T)$, and
$\alpha \ge 0$, $\alpha a(x^*(T),T) = 0$, $\gamma \ge 0$, $\gamma h(x^*(T),T) = 0$;

the Hamiltonian maximizing condition
$H[x^*(t), u^*(t), \lambda(t), t] \ge H[x^*(t), u, \lambda(t), t]$
at each $t \in [0,T]$ for all $u$ satisfying
$g[x^*(t), u, t] \ge 0$, and
$h_i^1(x^*(t), u, t) \ge 0$ whenever $h_i(x^*(t), t) = 0$, $i = 1, 2, \ldots, p$;

the jump conditions at any entry/contact time $\tau$ are
$\lambda(\tau^-) = \lambda(\tau^+) + \zeta(\tau) h_x(x^*(\tau), \tau)$ and
$H[x^*(\tau), u^*(\tau^-), \lambda(\tau^-), \tau] = H[x^*(\tau), u^*(\tau^+), \lambda(\tau^+), \tau] - \zeta(\tau) h_t(x^*(\tau), \tau)$;

the Lagrange multipliers $\mu(t)$ are such that
$\partial L/\partial u|_{u = u^*(t)} = 0$, $dH/dt = dL/dt = \partial L/\partial t + \rho\lambda f$,
and the complementary slackness conditions
$\mu(t) \ge 0$, $\mu(t) g(x^*, u^*, t) = 0$,
$\eta(t) \ge 0$, $\dot{\eta}(t) \le \rho\eta(t)$, $\eta(t) h(x^*(t), t) = 0$, and
$\zeta(\tau) \ge 0$, $\zeta(\tau) h(x^*(\tau), \tau) = 0$ hold.

(4.42)

In Exercise 4.9, you are asked to redo Example 4.3 by using the
current-value maximum principle (4.42).
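The two places where (4.42) differs from (4.29), the adjoint equation and the monotonicity condition on $\eta$, can be traced to the change of variables between present-value and current-value multipliers. The short derivation below is a sketch under the assumption that the current-value quantities are obtained by multiplying their present-value counterparts by $e^{\rho t}$, as in the current-value formulation of Chapter 3; the superscript "pv" is notation introduced here only to mark present-value quantities.

$$
\begin{aligned}
&\lambda^{pv}(t) = e^{-\rho t}\lambda(t), \qquad \eta^{pv}(t) = e^{-\rho t}\eta(t), \qquad L^{pv} = e^{-\rho t} L, \\[4pt]
&\dot{\lambda}^{pv} = -L^{pv}_x
\;\Longrightarrow\; e^{-\rho t}\left(\dot{\lambda} - \rho\lambda\right) = -e^{-\rho t} L_x
\;\Longrightarrow\; \dot{\lambda} = \rho\lambda - L_x, \\[4pt]
&\dot{\eta}^{pv} \le 0
\;\Longrightarrow\; e^{-\rho t}\left(\dot{\eta} - \rho\eta\right) \le 0
\;\Longrightarrow\; \dot{\eta} \le \rho\eta .
\end{aligned}
$$

The sign conditions $\eta \ge 0$ and the complementary slackness conditions are unaffected, because multiplying by the positive factor $e^{\rho t}$ preserves both signs and zeros.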



4.4 Sufficiency Conditions 

When first-order pure state constraints are present, sufficiency results
are usually stated in terms of the maximum principle using the direct
adjoining method described in Hartl, Sethi, and Vickson (1995).
However, since the direct and indirect methods are related as specified in
their paper, the sufficiency results can be restated in the indirect
adjoining framework. In order to do so, let us define the Hamiltonian $H$ and
the Lagrangian $L^d$ in the direct method as

$$H(x, u, \lambda^d, t) = F(x, u, t) + \lambda^d f(x, u, t) \tag{4.43}$$

and

$$L^d(x, u, \lambda^d, \mu^d, \eta^d, t) = H(x, u, \lambda^d, t) + \mu^d g(x, u, t) + \eta^d h(x, t), \tag{4.44}$$

where $\lambda^d$, $\mu^d$, and $\eta^d$ are multipliers in the direct formulation,
corresponding to $\lambda$, $\mu$, and $\eta$ in the indirect formulation. It is shown in Feichtinger
and Hartl (1986) and Hartl, Sethi, and Vickson (1995) that

$$\lambda^d(t) = \lambda(t) + \eta(t) h_x(x^*(t), t). \tag{4.45}$$

While all the multipliers including $\lambda^d$, $\mu^d$, and $\eta^d$ can be expressed in
terms of the multipliers $\lambda$, $\mu$, $\alpha$, $\beta$, $\gamma$, $\zeta$, and $\eta$ required in the
indirect formulation, we shall only need $\lambda^d(t)$ for formulating the sufficient
conditions.

We shall now state two sufficiency results for the problem specified
in (3.1)-(3.5) and (4.26); see Feichtinger and Hartl (1986) and Seierstad
and Sydsæter (1987) for the proofs of the direct adjoining version of the
following results.

Theorem 4.1 Let $(x^*, u^*, \lambda, \mu, \alpha, \beta, \gamma, \zeta, \eta)$ satisfy the necessary
conditions in (4.29) and let $\lambda^d(t) = \lambda(t) + \eta(t) h_x(x^*(t), t)$. If
$H(x, u, \lambda^d(t), t)$ is concave in $(x, u)$ at each $t \in [0,T]$, $S$ in (3.2) is
concave in $x$, $g$ in (3.3) is quasiconcave in $(x, u)$, $h$ in (4.26) and $a$ in
(3.4) are quasiconcave in $x$, and $b$ in (3.5) is linear in $x$, then $(x^*, u^*)$
is optimal.

Theorem 4.2 Theorem 4.1 remains valid if the concavity of
$H(x, u, \lambda^d(t), t)$ in $(x, u)$ at each $t$ is replaced by the concavity of the
maximized Hamiltonian $H^0(x, \lambda^d(t), t)$ in $x$ at each $t$, where

$$H^0(x, \lambda^d, t) = \max_{\{u \mid g(x,u,t) \ge 0\}} H(x, u, \lambda^d, t). \tag{4.46}$$

Theorems 4.1 and 4.2 are written for finite horizon problems. For
infinite horizon problems, these theorems remain valid if the transversality
conditions on the adjoint variables in (4.29) are replaced by the following
limiting transversality condition

$$\lim_{t \to \infty} \lambda(t)[x(t) - x^*(t)] \ge 0 \tag{4.47}$$

for every feasible state trajectory $x(t)$, $t \ge 0$. Note that when $T = \infty$,
we do not have to worry about the conditions on $\alpha$, $\beta$, and $\gamma$ in (4.42),
since for infinite horizon problems, we do not impose conditions (3.4)
and (3.5).
We shall conclude the chapter by illustrating the application of
Theorem 4.1 to the solution of Example 4.1 obtained earlier.

Example 4.1 (continued) First, we obtain the direct adjoint variable

$$\lambda^d(t) = \lambda(t) + \eta(t) h_x(x^*(t), t) = \begin{cases} t - 2, & t \in [0,1), \\ 0, & t \in [1,2]. \end{cases}$$

It is easy to see that

$$H(x, u, \lambda^d(t), t) = \begin{cases} -x + (t - 2)u, & t \in [0,1), \\ -x, & t \in [1,2], \end{cases}$$

is linear and hence concave in $(x, u)$ at each $t \in [0,2]$. Functions

$$g(x, u, t) = \begin{pmatrix} u + 1 \\ 1 - u \end{pmatrix}$$

and

$$h(x) = x$$

are linear and hence quasiconcave in $(x, u)$ and $x$, respectively. Functions
$S \equiv 0$, $a \equiv 0$, and $b \equiv 0$ satisfy the conditions of Theorem 4.1 trivially.
Thus, the solution obtained for Example 4.1 satisfies all the conditions
of Theorem 4.1, and is therefore optimal.

In Exercise 4.12, you are asked to use Theorem 4.1 to verify a given 
solution to be optimal for the exercise. 

In concluding this section, we should note that the sufficiency condi- 
tions stated in Theorems 4.1 and 4.2 rely on the presence of appropriate 
concavity conditions. Sufficiency conditions can also be obtained with- 
out these concavity assumptions. These are called second-order condi- 
tions for a local maximum, which require the second variation on the 
linearized state equation to be negative definite. For further details on 
the second-order sufficiency conditions, the reader is referred to Maurer 
(1981), Malinowski (1997), and references in Hartl, Sethi, and Vickson 
(1995). 



EXERCISES FOR CHAPTER 4 



4.1 Rework Example 4.1 with terminal time $T = 1/2$.

4.2 Change the objective function of Example 4.1 as follows:

$$\max \left\{ J = \int_0^2 (-u) \, dt \right\}.$$

Re-solve and show that the solution is not unique.

4.3 Specialize the maximum principle (4.29) for the nonnegativity state
constraint of the form

$$x(t) \ge 0 \quad \text{for all } t \text{ satisfying } 0 \le t \le T,$$

in place of $h(x,t) \ge 0$ in (4.26).

4.4* Consider the problem: 



max 




subject to

$$\dot{x} = -u - 1, \quad x(0) = 1,$$
$$x(t) \ge 0, \quad 0 \le u(t) \le 1.$$

Show that

(a) If $T = 1$, there is exactly one feasible and optimal solution.

(b) If $T > 1$, then there is no feasible solution.

(c) If $0 < T < 1$, then there is a unique optimal solution.

(d) If the control constraint is $0 \le u(t) \le K$, there is a unique
optimal solution for every $K \ge 1$ and $T = 1/2$.

(e) The value of the objective in (d) increases as $K$ increases.

(f) If the control constraint in (d) is $u(t) \ge 0$, then the optimal
control is an impulse control defined by the limit of the
solution in (e).

4.5 Transform the problem with pure constraints of type (4.26) in
Section 4.2 to a problem with nonnegativity constraints of type (4.1).

[Hint: Define $y = h(x,t)$ as an additional state variable. Recall
that we have assumed (4.26) to be first-order constraints.]

4.6* Consider a two-reservoir system such as that shown in Figure 4.4,
where $x_i(t)$ is the volume of water in reservoir $i$ and $u_i(t)$ is the
rate of discharge from reservoir $i$ at time $t$. Thus,

$$\dot{x}_1(t) = -u_1(t), \quad x_1(0) = 4,$$

$$\dot{x}_2(t) = u_1(t) - u_2(t), \quad x_2(0) = 4.$$

Figure 4.4: Two-Reservoir System of Exercise 4.6

Solve the problem of maximizing

$$J = \int_0^{10} \left[(10 - t) u_1(t) + t u_2(t)\right] dt$$

subject to the above state equations and the constraints

$$0 \le u_i(t) \le 1, \quad x_i(t) \ge 0 \quad \text{for all } t \in [0, 10].$$

Also compute the optimal value of the objective function.

[Hint: Guess the optimal solution and verify it by using the
Lagrangian form of the maximum principle.]

4.7* An Inventory Control Problem: Solve

$$\max_{P} \int_0^T -\left(hI + \frac{P^2}{2}\right) dt$$

subject to

$$\dot{I} = P - S, \quad I(0) = I_0,$$

and the control and the pure state inequality constraints

$$P \ge 0 \quad \text{and} \quad I \ge 0,$$

respectively. Assume that $S > 0$ and $h > 0$ are constants and $T$
is sufficiently large. Note that $I$ represents inventory, $P$ represents
production rate, and $S$ represents demand. The constraints on $P$
and $I$ mean that production must be nonnegative and backlogs are
not allowed, respectively.

[Hint: By $T$ being sufficiently large, we mean $T > I_0/S + S/(2h)$.]

4.8 Redo Example 4.1 with $T = 1.5$.

4.9 Redo Example 4.3 using the current-value maximum principle
(4.42) in Section 4.3.

4.10* For this exercise only, assume that $h(x,t) \ge 0$ in (4.26) is a
second-order constraint, i.e., $r = 2$. Transform the problem to one with
nonnegativity constraints. Use the result in Exercise 4.3 to derive
a maximum principle for problems with second-order constraints.

[Hint: As in Exercise 4.5, define $y = h$. In addition, define yet
another state variable $z = \dot{y} = h^1$. Note further that this procedure
can be generalized to handle problems with $r$th-order constraints
for any $r$.]

4.11 Re-solve Example 4.3 when r < 0.



4.12 Consider the following problem: 



min 





subject to the state equation 



$$\dot{x} = u - x, \quad x(0) = 1,$$

and the control and state constraints

$$0 \le u \le 1, \quad x(t) \ge 0.7 - 0.2t.$$

Use the sufficiency conditions in Theorem 4.1 to verify that the
optimal control for the problem is

$$u^*(t) = \begin{cases} 0, & 0 \le t \le \theta, \\ 0.5 - 0.2t, & \theta < t \le 2.5, \\ 0, & 2.5 < t \le 5, \end{cases}$$

where $\theta \approx 0.51626$. Sketch the optimal state trajectory $x^*(t)$ for
the problem.
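The value of $\theta$ quoted above can be checked numerically. The sketch below assumes that $\theta$ is the first time at which the uncontrolled state $x(t) = e^{-t}$ (the solution of the state equation with $u = 0$ and $x(0) = 1$) meets the lower bound $0.7 - 0.2t$; the function names and the bisection routine are illustrative and not part of the exercise.

```python
import math

def boundary_gap(t):
    """x(t) under u = 0, i.e. exp(-t), minus the lower bound 0.7 - 0.2 t."""
    return math.exp(-t) - (0.7 - 0.2 * t)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# The gap is positive at t = 0 and negative at t = 1, so a root lies in between.
theta = bisect(boundary_gap, 0.0, 1.0)

def boundary_control(t):
    """Control that keeps x on the boundary: u = x + dx/dt with x = 0.7 - 0.2 t."""
    return (0.7 - 0.2 * t) - 0.2

print(round(theta, 5), boundary_control(2.5))
```

On the boundary arc the control reduces to $0.5 - 0.2t$, which is the middle branch of the stated $u^*$ and vanishes at $t = 2.5$.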

4.13 Show that in Example 4.1, the values of $\lambda(t)$ and $\mu_1(t)$ in the
interval $[0, 1)$ are not unique. Specifically, show that

$$\lambda(t) = t - a, \quad \mu_1(t) = -\lambda(t) = a - t, \quad 0 \le t < 1,$$
$$\zeta(1) = 2 - a,$$

for any $a \in [1, 2]$, along with the other values of the multipliers
obtained in Example 4.1, also satisfy the maximum principle (4.29).
See Maurer (1977, 1979) and Maurer and Wiegand (1992) for other
examples with non-unique multipliers.

Applications to Finance



An important area of finance involves making decisions regarding invest- 
ment and dividend policies over time and ways to finance them. Among 
the ways of financing such policies are: issuing equity, retaining earnings, 
borrowing money, etc. It is possible to model such situations as optimal 
control problems; see, for example, Davis and Elzinga (1972), Elton and 
Gruber (1975), and Sethi (1978b). Some of these models are simple to 
analyze and they yield useful insights. 

In this chapter we deal with a cash balance problem and a problem 
of optimal equity financing of a firm. The former, in its simplest form, is 
the problem of controlling the level of a firm’s cash balances to meet its 
demand for cash at minimum total cost. The latter, a central problem 
in finance, is that of determining the optimal dividend path along with 
new equity issued over time in order to maximize the value of the firm. 

Although we only deal with deterministic problems in this chapter,
some of the more important problems in finance involve uncertainty.
Thus, their optimization requires the use of stochastic optimal control
theory or stochastic programming. A brief introduction to stochastic
optimal control theory will be provided in Chapter 13, together with an
application to a stochastic consumption-investment problem and
references.

In the next section, we introduce a simple cash balance problem as 
a tutorial. This model is based on Sethi and Thompson (1970) and 
Sethi (1973d, 1978c). We shall be especially interested in the financial 
interpretations for the various functions such as the Hamiltonian and the 
adjoint functions that arise in the course of the analysis.

5.1 The Simple Cash Balance Problem

Consider a firm which has a known demand for cash over time. To
satisfy this cash demand, the firm must keep some cash on hand. If the
firm keeps too much cash, it loses money in terms of opportunity cost,
in that it can earn higher returns by buying securities such as bonds.
On the other hand, if the cash balance is too small, the firm has to sell
securities to meet the cash demand and thus incur a broker's commission.
The problem then is to find the tradeoff between the cash and security
balances.

5.1.1 The Model 

To formulate the optimal control problem we introduce the following 
notation: 

$T$ = the time horizon,
$x(t)$ = the cash balance in dollars at time $t$,
$y(t)$ = the security balance in dollars at time $t$,
$d(t)$ = the instantaneous rate of demand for cash;
$d(t)$ can be positive or negative,
$u(t)$ = the rate of sale of securities in dollars; a negative sales
rate means a rate of purchase,
$r_1(t)$ = the interest rate earned on the cash balance,
$r_2(t)$ = the interest rate earned on the security balance,
$\alpha$ = the broker's commission in dollars per dollar's worth of
securities bought or sold; $0 < \alpha < 1$.

The state equations are

$$\dot{x} = r_1 x - d + u - \alpha|u|, \quad x(0) = x_0, \tag{5.1}$$

$$\dot{y} = r_2 y - u, \quad y(0) = y_0, \tag{5.2}$$

and the control constraints are

$$-U_2 \le u(t) \le U_1, \tag{5.3}$$

where $U_1$ and $U_2$ are nonnegative constants. The objective function is
to

$$\text{maximize } \{J = x(T) + y(T)\} \tag{5.4}$$

subject to (5.1)-(5.3). Note that the problem is in the linear Mayer form.

5.1.2 Solution by the Maximum Principle 

Introduce the adjoint variables $\lambda_1$ and $\lambda_2$ and define the Hamiltonian
function

$$H = \lambda_1(r_1 x - d + u - \alpha|u|) + \lambda_2(r_2 y - u). \tag{5.5}$$

The adjoint variables satisfy the differential equations

$$\dot{\lambda}_1 = -\frac{\partial H}{\partial x} = -\lambda_1 r_1, \quad \lambda_1(T) = 1, \tag{5.6}$$

$$\dot{\lambda}_2 = -\frac{\partial H}{\partial y} = -\lambda_2 r_2, \quad \lambda_2(T) = 1. \tag{5.7}$$

It is easy to solve these, respectively, as

$$\lambda_1(t) = e^{\int_t^T r_1(\tau)\,d\tau} \tag{5.8}$$

and

$$\lambda_2(t) = e^{\int_t^T r_2(\tau)\,d\tau}. \tag{5.9}$$

The interpretations of these solutions are also clear. Namely, $\lambda_1(t)$ is
the future value (at time $T$) of one dollar held in the cash account from
time $t$ to $T$ and, likewise, $\lambda_2(t)$ is the future value of one dollar invested
in securities from time $t$ to $T$. Thus, the adjoint variables have natural
interpretations as the actuarial evaluations of competitive investments
at each point of time.
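As a small numerical illustration of (5.8) and (5.9), the sketch below evaluates the adjoint variables as future values of one dollar. The function name `future_value`, the midpoint-rule integration, and the constant rates 0.05 and 0.08 are assumptions chosen for illustration; none of them come from the text.

```python
import math

def future_value(rate, t, T, steps=1000):
    """Approximate exp(integral of rate(s) ds from t to T) with the midpoint rule."""
    h = (T - t) / steps
    integral = sum(rate(t + (k + 0.5) * h) for k in range(steps)) * h
    return math.exp(integral)

# Hypothetical constant interest rates (not from the text).
def r1(s):
    return 0.05   # rate on the cash account

def r2(s):
    return 0.08   # rate on securities

T = 10.0
lam1_at_0 = future_value(r1, 0.0, T)   # value at T of one dollar kept in cash from t = 0
lam2_at_0 = future_value(r2, 0.0, T)   # value at T of one dollar kept in securities from t = 0
lam1_at_T = future_value(r1, T, T)     # terminal condition lambda_1(T) = 1

print(lam1_at_0, lam2_at_0, lam1_at_T)
```

With these assumed rates a dollar in securities is worth more at $T$ than a dollar in cash, which is the comparison that drives the trading policy derived next.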

Let us now derive the optimal policy by choosing the control variable
$u$ to maximize the Hamiltonian in (5.5). In order to deal with the
absolute value function we write the control variable $u$ as the difference
of two nonnegative variables, i.e.,

$$u = u_1 - u_2, \quad u_1 \ge 0, \quad u_2 \ge 0. \tag{5.10}$$

Recall that this method was suggested in Remark 3.3 in Section 3.6. In
order to make $u = u_1$ when $u_1$ is strictly positive, and $u = -u_2$ when $u_2$
is strictly positive, we also impose the quadratic constraint

$$u_1 u_2 = 0, \tag{5.11}$$

so that at most one of $u_1$ and $u_2$ can be nonzero. However, the optimal
properties of the solution will automatically cause this constraint to be
satisfied. The reason is that the broker's commission must be paid on
every transaction, which makes it not optimal to simultaneously buy and
sell securities. Given (5.10) and (5.11) we can write

$$|u| = u_1 + u_2. \tag{5.12}$$

We can now substitute (5.10) and (5.12) into the Hamiltonian (5.5) and
reproduce below the part which depends on control variables $u_1$ and $u_2$,
and denote it by $W$. Thus,

$$W = u_1[(1 - \alpha)\lambda_1 - \lambda_2] - u_2[(1 + \alpha)\lambda_1 - \lambda_2]. \tag{5.13}$$

Maximizing the Hamiltonian (5.5) with respect to $u$ is the same as
maximizing $W$ with respect to $u_1$ and $u_2$. But $W$ is linear in $u_1$ and $u_2$ so
that the optimal strategy is bang-bang and is as follows:

$$u^* = u_1^* - u_2^*, \tag{5.14}$$

where

$$u_1^* = \text{bang}[0, U_1; (1 - \alpha)\lambda_1 - \lambda_2], \tag{5.15}$$

$$u_2^* = \text{bang}[0, U_2; -(1 + \alpha)\lambda_1 + \lambda_2]. \tag{5.16}$$

Since $u_1(t)$ represents the rate of sale of securities, (5.15) says that the
optimal policy is: sell at the maximum allowable rate if the future value
of a dollar less the broker's commission (i.e., the future value of $(1 - \alpha)$
dollars) is greater than the future value of a dollar's worth of securities;
and do not sell if these future values are in reverse order. In case the
future value of a dollar less the commission is exactly equal to the
future value of a dollar's worth of securities, then the optimal policy is
undetermined. In fact, we are indifferent as to the action taken, and
this is called singular control. Similarly, $u_2(t)$ represents the purchase of
securities. Here we buy, do not buy, or are indifferent, if the future value
of a dollar plus the commission is less than, greater than, or equal to the
future value of a dollar's worth of securities, respectively.

Note that if

$$(1 - \alpha)\lambda_1(t) \ge \lambda_2(t),$$

then

$$(1 + \alpha)\lambda_1(t) > \lambda_2(t),$$

so that if $u_1(t) > 0$, then $u_2(t) = 0$. Similarly, if

$$(1 + \alpha)\lambda_1(t) \le \lambda_2(t),$$

then

$$(1 - \alpha)\lambda_1(t) < \lambda_2(t),$$

so that if $u_2(t) > 0$, then $u_1(t) = 0$. Hence, with the optimal policy, the
relation (5.11) is always satisfied.

Figure 5.1: Optimal Policy Shown in $(\lambda_1, \lambda_2)$ Space
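The policy (5.14)-(5.16) can be sketched in a few lines of code. The helper names `bang` and `optimal_trade`, the use of `None` to flag a singular (undetermined) control, and the numerical values in the calls are assumptions made for illustration only.

```python
def bang(lower, upper, switch):
    """bang[lower, upper; switch]: upper if switch > 0, lower if switch < 0,
    and None (undetermined, i.e. singular) if switch == 0."""
    if switch > 0:
        return upper
    if switch < 0:
        return lower
    return None

def optimal_trade(lam1, lam2, alpha, U1, U2):
    """Return (u1*, u2*) from (5.15)-(5.16): sale rate and purchase rate."""
    u1 = bang(0.0, U1, (1 - alpha) * lam1 - lam2)    # sell if cash is worth more net of commission
    u2 = bang(0.0, U2, -(1 + alpha) * lam1 + lam2)   # buy if securities are worth more net of commission
    return u1, u2

# Hypothetical values of the adjoint variables, commission, and rate bounds.
sell_case = optimal_trade(lam1=1.2, lam2=1.0, alpha=0.01, U1=5.0, U2=3.0)
buy_case = optimal_trade(lam1=1.0, lam2=1.2, alpha=0.01, U1=5.0, U2=3.0)
terminal_case = optimal_trade(lam1=1.0, lam2=1.0, alpha=0.01, U1=5.0, U2=3.0)

print(sell_case, buy_case, terminal_case)
```

In each of the three calls at most one of the two rates is nonzero, in line with (5.11), and at the terminal values $\lambda_1 = \lambda_2 = 1$ neither selling nor buying is chosen.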

Figure 5.1 illustrates the optimal policy at time $t$. The first quadrant
is divided into three areas which represent different actions (including
no action) to be taken. The dotted lines represent the singular control
manifolds. A possible path of the vector $(\lambda_1(t), \lambda_2(t))$ of the adjoint
variables is shown in Figure 5.1 also. Note that on this path, there is one
period of selling, two periods of buying, and three periods of inactivity.
Note also that the final point on the path is (1,1), since the terminal
values $\lambda_1(T) = \lambda_2(T) = 1$, and therefore, the last interval is always
characterized by inactivity.

Another way to represent the optimal path is in the $(t, \lambda_2/\lambda_1)$ space.
The path of $(\lambda_1(t), \lambda_2(t))$ shown in Figure 5.1 corresponds to the path
of $\lambda_2(t)/\lambda_1(t)$ over time shown in Figure 5.2.

Figure 5.2: Optimal Policy Shown in $(t, \lambda_2/\lambda_1)$ Space

5.1.3 An Extension Disallowing Overdraft and 
Short-Selling 

We can also formulate the cash balance problem in which overdrafts and
short-sales are disallowed. To do this mathematically we impose the
additional constraints

$$x(t) \ge 0 \quad \text{and} \quad y(t) \ge 0. \tag{5.17}$$

Furthermore, we relax the control constraints (5.3), i.e., we set
$U_1 = U_2 = \infty$ for simplicity in exposition. In Exercise 5.2 you will be asked to
carry out the parallel development when $U_1$ and $U_2$ are finite. Generally,
the presence of (5.17) will require us to use the maximum principle (4.29).
Thus, we form the Lagrangian as in (4.28) or (4.4) to be

$$L = H + \eta_1 \dot{x} + \eta_2 \dot{y} = \lambda_1(r_1 x - d + u - \alpha|u|) + \lambda_2(r_2 y - u) + \eta_1(r_1 x - d + u - \alpha|u|) + \eta_2(r_2 y - u), \tag{5.18}$$

where the adjoint variables satisfy

$$\dot{\lambda}_1 = -\frac{\partial L}{\partial x} = -(\lambda_1 + \eta_1)r_1, \quad \lambda_1(T^-) \ge 1, \quad [\lambda_1(T) - 1]x(T) = 0, \tag{5.19}$$

$$\dot{\lambda}_2 = -\frac{\partial L}{\partial y} = -(\lambda_2 + \eta_2)r_2, \quad \lambda_2(T^-) \ge 1, \quad [\lambda_2(T) - 1]y(T) = 0, \tag{5.20}$$

the Lagrange multipliers $\eta_1$ and $\eta_2$ satisfy the complementary slackness
conditions

$$\eta_1(t) \ge 0, \quad \eta_1(t)x(t) = 0, \quad \eta_1(t)[r_1 x(t) - d(t) + u(t) - \alpha|u(t)|] = 0, \tag{5.21}$$
$$\eta_2(t) \ge 0, \quad \eta_2(t)y(t) = 0, \quad \eta_2(t)[r_2 y(t) - u(t)] = 0, \tag{5.22}$$

and

$$\frac{\partial L}{\partial u} = 0 \tag{5.23}$$

for all $t \in [0, T]$. Note that the transversality conditions in (5.19) and
(5.20) on the adjoint variables are written as in Row 3 of Table 3.1. This
form is easily seen to be equivalent to the transversality condition in
(4.29).

In general, the solution of this problem is difficult and requires the
use of a computer. However, we illustrate an easy case in which $\alpha = 0$
and $r_1$ and $r_2$ are piecewise-constant functions of time.



Example 5.1 (We are indebted to C. Norstrom for this example.)
Consider the model when $\alpha = 0$, $T = 10$, and $r_1$ and $r_2$ vary over time as
follows:

$$r_1(t) = \begin{cases} 0 & \text{for } 0 \le t < 5, \\ 0.3 & \text{for } 5 \le t \le 10, \end{cases} \tag{5.24}$$

$$r_2(t) = 0.1 \quad \text{for } 0 \le t \le 10. \tag{5.25}$$

The initial cash and security balances are, respectively,

$$x_0 = 0 \quad \text{and} \quad y_0 = 3.$$

For convenience in exposition, we assume $d(t) = 0$, $0 \le t \le 10$; see
Remark 5.1 at the end of this subsection. The optimal solution when
there is no upper bound on the control is easy to guess. It is

$$u^*(t) = 0 \quad \text{for } 0 \le t < 5, \tag{5.26}$$

$$u^*(t) = 0 \quad \text{for } 5 < t \le 10. \tag{5.27}$$

At $t = 5$, the optimal action clearly is to sell all the securities
instantaneously to take advantage of the higher interest rate on cash. We can do
this because there is no upper bound on the rate of sales. Such a control
is called an impulse control. It can be conceived to be the result of
selling securities at a very large rate for a very short time so that the entire
security balance is converted into cash more or less instantaneously.

To formalize the impulse control at $t = 5$, we first use (5.26) in (5.2)
to obtain

$$y(5^-) = 3e^{0.5}.$$

In words, the security balance just before time $t = 5$ is
obtained by earning interest at rate 0.1, compounded continuously from
time 0 to 5, on the initial security balance of 3.

Let us now sell securities at the rate of $3e^{0.5}/(2\delta t)$ for a short time
interval of $2\delta t$ beginning at time $t = 5 - \delta t$ and ending at time $t = 5 + \delta t$,
i.e.,

$$u^*(t) = \frac{3e^{0.5}}{2\delta t}, \quad t \in [5 - \delta t, 5 + \delta t].$$

Using this control in (5.2), we can integrate the resulting equation from
$t = 5 - \delta t$ to $t = 5 + \delta t$ by applying the formula (A.1.6). Thus, we have

$$y(5 + \delta t) = y(5 - \delta t)e^{0.2\delta t} - \int_{5-\delta t}^{5+\delta t} e^{0.1(5 + \delta t - \tau)}\,\frac{3e^{0.5}}{2\delta t}\,d\tau = y(5 - \delta t)e^{0.2\delta t} - \frac{3e^{0.5}}{0.2\,\delta t}[0.2\,\delta t + o(\delta t)]. \tag{5.28}$$

Taking the limit of (5.28) as $\delta t \to 0$, we obtain

$$y(5^+) = y(5^-) - 3e^{0.5} = 3e^{0.5} - 3e^{0.5} = 0.$$

Note that as $\delta t \to 0$, $u^*(t)$, defined above, goes to infinity. The
net effect of $\delta t \to 0$ and $u^*(t) \to \infty$ is that the entire security balance is
sold instantaneously at time 5, giving us $y(5^+) = 0$, i.e., a zero security
balance just after $t = 5$.

Note that $x(5^-) = 0$. Following the impulse sale of securities at $t = 5$,
$x(5^+) = 3e^{0.5}$. Further discussion and applications of impulse controls
will be given in Chapters 7, 10, and 12.
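The limiting argument can also be checked numerically. The sketch below evaluates the security balance at the end of the selling window in closed form (variation of constants for the linear equation (5.2) with a constant sale rate) for shrinking values of $\delta t$. The function name `y_after_sale` and the particular window half-widths are assumptions made for illustration.

```python
import math

R2 = 0.1                        # security interest rate in Example 5.1
Y_BEFORE = 3 * math.exp(0.5)    # y(5^-) = 3 e^{0.5}

def y_after_sale(dt):
    """y(5 + dt) when selling at the constant rate Y_BEFORE / (2 dt) on [5 - dt, 5 + dt]."""
    y_start = 3 * math.exp(R2 * (5 - dt))   # y(5 - dt), reached with u = 0
    rate = Y_BEFORE / (2 * dt)              # approximating sale rate
    growth = math.exp(R2 * 2 * dt)          # e^{0.2 dt}
    # Variation of constants for y' = R2 * y - rate over an interval of length 2 dt.
    return y_start * growth - rate * (growth - 1) / R2

for dt in (0.1, 0.01, 0.001):
    print(dt, y_after_sale(dt))
```

The residual balance shrinks roughly like $\delta t^2$. For finite $\delta t$ it comes out slightly negative, because the fixed sale rate removes marginally more than the balance available over the window; only the limit $\delta t \to 0$ is used in the text.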

We must now find the adjoint variables $\lambda_1$ and $\lambda_2$ and Lagrange
multipliers $\eta_1$ and $\eta_2$ so that the maximum principle holds. From (5.23)
with $\alpha = 0$, we get $\lambda_1 - \lambda_2 + \eta_1 - \eta_2 = 0$, or

$$\lambda_1 + \eta_1 = \lambda_2 + \eta_2. \tag{5.29}$$

We now solve for these quantities in three time intervals $0 \le t < 5$, $t = 5$,
and $5 < t \le 10$. We will start with $t$ in the last interval:

(a) Assume $t \in (5, 10]$. Since $x > 0$ we know $\eta_1 = 0$. Using (5.29) we
can write (5.19) and (5.20) as

$$\dot{\lambda}_1 = -(\lambda_1 + \eta_1)r_1(t) = -0.3\lambda_1, \quad \lambda_1(10) = 1,$$

$$\dot{\lambda}_2 = -(\lambda_2 + \eta_2)r_2(t) = -0.1\lambda_1, \quad \lambda_2(10) = 1.$$

Solving these we have

$$\lambda_1(t) = e^{0.3(10 - t)},$$

$$\lambda_2(t) = \frac{2}{3} + \frac{1}{3}e^{0.3(10 - t)},$$

$$\eta_2(t) = \lambda_1 - \lambda_2 = \frac{2}{3}\left[e^{0.3(10 - t)} - 1\right],$$

which are displayed in Figure 5.3. Since $\lambda_1(t) > \lambda_2(t)$ in this interval and
$y(t) = 0$, it follows that $u^*(t) = 0$ maximizes the Hamiltonian subject to
the constraints (5.17).

(b) Assume $t = 5$. As mentioned previously, the optimal action is
to apply impulse $u^*(5) = \text{imp}(3e^{0.5}, 0, 5)$. In Exercise 5.10, you will
be asked to re-solve this problem when $-1 \le u \le 1$, and compare the
solution with (5.26)-(5.29) and Figure 5.3.

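(c) Assume $t \in [0, 5)$. In this interval $y > 0$ so that $\eta_2 = 0$ and
$\eta_1 = \lambda_2 - \lambda_1$. Thus, the adjoint equations (5.19) and (5.20) reduce to

$$\dot{\lambda}_1 = -(\lambda_1 + \eta_1)r_1(t) = 0, \quad \lambda_1(5^-) = e^{1.5},$$

$$\dot{\lambda}_2 = -(\lambda_2 + \eta_2)r_2(t) = -0.1\lambda_2, \quad \lambda_2(5^-) = e^{1.5}.$$

Solving these we obtain

$$\lambda_1(t) = e^{1.5}, \quad \lambda_2(t) = e^{2.0 - 0.1t}, \quad \eta_1(t) = e^{1.5}\left[e^{0.5 - 0.1t} - 1\right].$$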

Here $\lambda_1(5^-) = \lambda_2(5^-)$ because $t = 5$ is the switching point. Moreover,
both $\lambda_1$ and $\lambda_2$ are always nonincreasing. Hence, $\lambda_2$ must jump at $t = 5$; see
Figure 5.3. In the interval $[0, 5)$, $\lambda_2 > \lambda_1$ so that it would be optimal to
buy if there were cash on hand. Since $x(t) = 0$, it follows that $u^*(t) = 0$
maximizes the Hamiltonian subject to the constraints (5.17). Note that
if $x(0)$ were positive, the optimal policy would (obviously) be to use
$x(0)$ to instantaneously buy securities at $t = 0$. A final remark is that
$\lambda_2$ jumps down at $t = 5$ from $\lambda_2(5^-)$ to $\lambda_2(5^+)$, in accordance with the
jump condition specified in Section 4.1.

Figure 5.3: Adjoint Variables and Lagrange Multipliers for Example 5.1

Note that in Figure 5.3,

$$\lambda_1(0) = \lambda_1(5^-) = \lambda_1(5^+) = \lambda_2(5^-) = e^{1.5} \approx 4.48,$$

$$\lambda_2(0) = e^{2.0} \approx 7.38,$$

$$\lambda_2(5^+) = (2 + e^{1.5})/3 \approx 2.16,$$

$$\eta_1(0) = \lambda_2(0) - \lambda_1(0) = e^{2.0} - e^{1.5} \approx 2.90,$$

$$\eta_2(5^+) = \lambda_1(5^+) - \lambda_2(5^+) = 2[e^{1.5} - 1]/3 \approx 2.32.$$
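The figures quoted for Figure 5.3 can be reproduced from the closed-form multipliers of parts (a) and (c). In the sketch below the function names are illustrative assumptions; the formulas are those derived in the example. The printed values agree with the quoted ones up to the last displayed digit.

```python
import math

def lam1(t):
    """lambda_1: constant e^{1.5} on [0, 5), then e^{0.3 (10 - t)} on [5, 10]."""
    return math.exp(1.5) if t < 5 else math.exp(0.3 * (10 - t))

def lam2_before(t):
    """lambda_2 on [0, 5), from part (c)."""
    return math.exp(2.0 - 0.1 * t)

def lam2_after(t):
    """lambda_2 on (5, 10], from part (a)."""
    return 2 / 3 + math.exp(0.3 * (10 - t)) / 3

eta1_at_0 = lam2_before(0) - lam1(0)            # eta_1 = lambda_2 - lambda_1 on [0, 5)
eta2_at_5_plus = lam1(5) - lam2_after(5)        # eta_2 = lambda_1 - lambda_2 on (5, 10]
jump_in_lam2 = lam2_before(5) - lam2_after(5)   # lambda_2(5^-) - lambda_2(5^+)

print(lam1(0), lam2_before(0), lam2_after(5), eta1_at_0, eta2_at_5_plus, jump_in_lam2)
```

With these formulas the size of the downward jump in $\lambda_2$ at $t = 5$ coincides numerically with $\eta_2(5^+)$, because $\lambda_1$ is continuous at $t = 5$ and equals $\lambda_2(5^-)$ there.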



In Exercise 5.2, you are asked to extend the formulation in Section
5.1.3 to allow for finite bounds on sale and purchase rates.

Remark 5.1 Note that if $d(t) \ne 0$, it is easy to modify the above
solution to obtain the optimal solution, provided there is a feasible solution
for the given demand $d(t)$, $0 \le t \le 10$. It is the solution in which the
firm keeps all its assets in securities for $0 \le t < 5$ and in cash for
$5 < t \le 10$, while meeting its cash demand obligations.

5.2 Optimal Financing Model 

In the present section, we discuss a model of a firm which must finance its 
investments by an optimal combination of retained earnings and external 
equity. The model to be discussed is due to Krouse and Lee (1973), with 
corrections and extensions due to Sethi (1978b). The problem of the 
optimal financing of the firm can be formulated as an optimal control 
problem. The formulations, such as those of Davis (1970), Krouse (1972), 
and Krouse and Lee (1973), permit the firm to finance its investments 
by retained earnings, debt, and/or external equity in various proportions 
which may vary over time. Note that earnings not retained are paid out 
as dividends to the firm’s stockholders. 

For reasons of simplicity and ease of its solution, the model analyzed 
here does not permit debt as a source of financing, but does permit 
retained earnings and external equity to be used in any proportions. 

5.2.1 The Model 

In order to formulate the model, we use the following notation: 

$y(t)$ = the value of the firm's assets or invested capital at time $t$,
$x(t)$ = the current earnings rate in dollars per unit time at time $t$,
$u(t)$ = the external or new equity financing expressed as a
multiple of current earnings; $u \ge 0$,
$v(t)$ = the fraction of current earnings retained, i.e., $1 - v(t)$
represents the rate of dividend payout; $0 \le v(t) \le 1$,
$1 - c$ = the proportional floatation (i.e., transaction) cost
for external equity; $c$ a constant, $0 \le c < 1$,
$\rho$ = the continuous discount rate (assumed constant); known
commonly as the stockholder's required rate of return,
$r$ = the actual rate of return (assumed constant) on the firm's
invested capital; $r > \rho$,
$g$ = the upper bound on the growth rate of the firm's assets,
$T$ = the planning horizon; $T < \infty$ ($T = \infty$ in Section 5.2.4).







Given these definitions, the current earnings are $x = ry$. The rate of
change in current earnings is given by

$$\dot{x} = r\dot{y} = r(cu + v)x, \quad x(0) = x_0. \tag{5.30}$$

Furthermore, the upper bound on the rate of growth of the assets implies
the following constraint on the control variables:

$$\dot{y}/y = (cu + v)x/(x/r) = r(cu + v) \le g. \tag{5.31}$$

Finally, the objective of the firm is to maximize its value, which is
taken to be the present value of the future dividend stream accruing to
the shares outstanding at time zero. To derive this expression, note that

$$\int_0^T (1 - v)xe^{-\rho t}\,dt$$

represents the present value of total dividends issued by the firm. A
portion of these dividends goes to the new equity, which under the
assumption of an efficient market will get a rate of return exactly equal to
the discount rate $\rho$. This should therefore be equal to the present value

$$\int_0^T uxe^{-\rho t}\,dt$$

of the external equity raised over time.

Thus, the net present value of the total future dividends that accrue
to the initial shares is the difference of the above two expressions, i.e.,

$$J = \int_0^T e^{-\rho t}(1 - v - u)x\,dt; \tag{5.32}$$

see Miller and Modigliani (1961), Sethi, Derzko, and Lehoczky (1982),
and Sethi (1996) for further discussion. Note that in the case of a finite
horizon, a more realistic objective function would include a salvage value
or bequest term $S[x(T)]$. This is not very difficult to incorporate. See
Exercise 5.9 where the bequest function is linear. We will also solve the
infinite horizon problem (i.e., $T = \infty$) after we have solved the finite
horizon problem.

The optimal control problem is to choose $u$ and $v$ over time so as
to maximize $J$ in (5.32) subject to (5.30), the constraints (5.31), $u \ge 0$,
and $0 \le v \le 1$. For convenience, we restate this problem as

$$\begin{cases} \max\limits_{u,v} \left\{ J = \int_0^T e^{-\rho t}(1 - u - v)x\,dt \right\} \\ \text{subject to} \\ \dot{x} = r(cu + v)x, \quad x(0) = x_0, \\ \text{and the control constraints} \\ cu + v \le g/r, \quad u \ge 0, \quad 0 \le v \le 1. \end{cases} \tag{5.33}$$



5.2.2 Application of the Maximum Principle 

This is a bilinear problem with two control variables which is a special
case of Row (f) in Table 3.3. The current-value Hamiltonian is

$$H = (1 - v - u)x + \lambda r(cu + v)x, \tag{5.34}$$

where the current-value adjoint variable $\lambda$ satisfies

$$\dot{\lambda} = \rho\lambda - (1 - v - u) - \lambda r(cu + v) \tag{5.35}$$

with the transversality condition

$$\lambda(T) = 0. \tag{5.36}$$

From the Hamiltonian in (5.34) and Section 2.2.2, we know that
$\lambda(t)$ can be interpreted as the marginal value (in time $t$ dollars) of a unit
change in earnings at time $t$. Also $(cu + v)x$ is the incremental investment
at time $t$. Thus, $\lambda r$ is the marginal value of a unit investment at time
$t$. Therefore, the product $\lambda r(cu + v)x$ is the value to the stockholders of
the incremental investment measured in terms of foregone dividends (in
other words in time $t$ dollars). We can also interpret (5.35) as in Section
2.2.4. More specifically, if the firm makes an incremental investment of
$\lambda$ (which gives an incremental earning of one dollar), $\rho\lambda$ is the expected
return from this investment. In equilibrium this must be equal to the
"capital gain" $\dot{\lambda}$, plus the immediate dividend $(1 - v)$ less $u$, the "claim"
of the external stockholders, plus the value of the incremental earnings
$\lambda r(cu + v)$.

To specify the form of optimal policy, we rewrite the Hamiltonian as

$$H = [W_1 u + W_2 v + 1]x, \tag{5.37}$$

where

$$W_1 = cr\lambda - 1, \tag{5.38}$$

$$W_2 = r\lambda - 1. \tag{5.39}$$

Note first that the state variable $x$ factors out so that the optimal
controls are independent of the state variable. Second, since the
Hamiltonian is linear in the two control variables, the optimal policy is a
combination of generalized bang-bang and singular controls. Of course, the
characterization of these optimal controls in terms of the adjoint variable
$\lambda$ will require solving a parametric linear programming problem at each
instant of time $t$. The Hamiltonian maximization problem can be stated
as follows:

$$\begin{cases} \max\limits_{u,v} \{W_1 u + W_2 v\} \\ \text{subject to} \\ u \ge 0, \quad 0 \le v \le 1, \quad cu + v \le g/r. \end{cases} \tag{5.40}$$

Obviously, the constraint $v \le 1$ becomes redundant if $g/r \le 1$.
Therefore, we have two cases:

Case A: $g \le r$ and Case B: $g > r$,

under each of which, we can solve the linear programming problem (5.40)
graphically in a closed form. This is done in Figures 5.4 and 5.5.
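The graphical solution of (5.40) can be mimicked by enumerating the vertices of the feasible polygon, since a linear objective attains its maximum at a vertex. The sketch below assumes $0 < c < 1$ (so that the polygon is bounded); the function name `maximize_hamiltonian` and the parameter values in the calls are hypothetical. When the objective ties along an edge (the singular cases), the function simply returns the first maximizing vertex it meets.

```python
def maximize_hamiltonian(lam, r, c, g):
    """Return a vertex (u, v) maximizing W1*u + W2*v subject to
    u >= 0, 0 <= v <= 1, c*u + v <= g/r. Assumes 0 < c < 1 and r, g > 0."""
    W1 = c * r * lam - 1.0
    W2 = r * lam - 1.0
    # Vertices common to both cases.
    vertices = [(0.0, 0.0), (g / (r * c), 0.0), (0.0, min(1.0, g / r))]
    if g > r:
        # In Case B the line c*u + v = g/r also meets the line v = 1.
        vertices.append(((g - r) / (r * c), 1.0))
    return max(vertices, key=lambda p: W1 * p[0] + W2 * p[1])

# Hypothetical parameters: r = 0.15, c = 0.9.
no_investment = maximize_hamiltonian(lam=0.0, r=0.15, c=0.9, g=0.05)    # W1, W2 < 0
case_a_growth = maximize_hamiltonian(lam=10.0, r=0.15, c=0.9, g=0.05)   # Case A, W2 > 0
case_b_equity = maximize_hamiltonian(lam=10.0, r=0.15, c=0.9, g=0.20)   # Case B, W1 > 0
case_b_retain = maximize_hamiltonian(lam=7.0, r=0.15, c=0.9, g=0.20)    # Case B, W1 < 0 < W2

print(no_investment, case_a_growth, case_b_equity, case_b_retain)
```

The four calls return, in order, no investment, retention at the rate $g/r$ without external equity, full retention with external equity $(g - r)/(rc)$, and full retention without external equity.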

Since $c < 1$ by assumption, the following subcases shown in Figures
5.4 and 5.5 can be ruled out:

(i) $W_1 > cW_2$, $W_1 > 0$ $\Rightarrow$ $c > 1$, ruling out Subcases A2 and B2.

(ii) $W_1 = 0$, $W_2 < 0$ $\Rightarrow$ $c > 1$, ruling out Subcases A4 and B5.

(iii) $W_1 = cW_2 > 0$ $\Rightarrow$ $c = 1$, ruling out Subcases A5 and B6.

(iv) $W_1 = W_2 = 0$ $\Rightarrow$ $c = 1$, ruling out Subcases A7 and B9.

For the remaining subcases shown adjacent to the darkened lines in
Figures 5.4 and 5.5, we characterize the corresponding optimal controls
in Table 5.1.

The catalog of possible optimal control regimes shown in Table 5.1
gives the potential time-paths for the firm. What must be done to
obtain the optimal path (given an initial condition) is to synthesize these
subcases into an optimal sequence. This is carried out in the following
section.

Figure 5.4: Case A: $g \le r$

Before proceeding with our synthesis, we note that since we have
assumed $c < 1$ in this chapter, we have $W_1 < cW_2$ from (5.38) and
(5.39). We can, therefore, characterize Subcase A3 simply by $W_2 > 0$
and Subcase B3 simply by $W_1 > 0$. It is these simpler characterizations
of Subcases A3 and B3 that we shall use in our subsequent discussion.

5.2.3 Synthesis of Optimal Control Paths 

To obtain an optimal path, we must synthesize an optimal sequence
of subcases. The usual procedure employed is that of the reverse-time
construction, first developed by Isaacs (1965). Reverse time can only be
defined for finite horizon problems. However, the infinite horizon solution
can usually be inferred from the finite horizon solution if sufficient care
is exercised. This will be done in Section 5.2.4.

Figure 5.5: Case B: $g > r$

Our analysis of the finite horizon problem (5.33) proceeds with the
assumption that the terminal time $T$ is sufficiently large.
We shall make this assumption precise during our analysis. Moreover,
we shall discuss the solution when $T$ is not sufficiently large in Remarks
5.2 and 5.4.

Define the reverse-time variable $\tau$ as

$$\tau = T - t,$$

so that

$$\overset{\circ}{y} = \frac{dy}{d\tau} = \frac{dy}{dt}\,\frac{dt}{d\tau}.$$

Row  Conditions on W1, W2      Case A: g <= r  Case B: g > r  Optimal Controls              Characterization
                               Subcases        Subcases
(1)  W1 < 0, W2 < 0            A1              B1             u* = 0, v* = 0                generalized bang-bang
(2)  W1 < cW2, W2 > 0          A3              -              u* = 0, v* = g/r              generalized bang-bang
(3)  W1 < 0, W2 = 0            A6              B8             u* = 0,                       singular
                                                              0 <= v* <= min[1, g/r]
(4)  0 < W1 < cW2              -               B3             u* = (g - r)/rc, v* = 1       generalized bang-bang
(5)  W1 < 0, W2 > 0            -               B4             u* = 0, v* = 1                generalized bang-bang
(6)  W1 = 0, W2 > 0            -               B7             0 <= u* <= (g - r)/rc,        singular
                                                              v* = 1

Table 5.1: Characterization of Optimal Controls



As a consequence, $\overset{\circ}{y} = -\dot{y}$, and the reverse-time versions of the state
and adjoint equations (5.30) and (5.35), respectively, can be obtained by
simply replacing $\dot{y}$ by $\overset{\circ}{y}$ and changing the signs of the right-hand sides.
The transversality condition on the adjoint variable

$$\lambda(t = T) = \lambda(\tau = 0) = 0 \tag{5.41}$$

becomes the initial condition in the reverse-time sense. Furthermore, let
us parameterize the terminal state by assuming that

$$x(t = T) = x(\tau = 0) = \alpha_A, \tag{5.42}$$

where $\alpha_A$ is a parameter to be determined.

From now on in this section, everything is expressed in the reverse-time
sense unless otherwise specified. Using the definitions of $\overset{\circ}{x}$ and $\overset{\circ}{\lambda}$ and the
conditions (5.42) and (5.41), we can write reverse-time versions of (5.30)
and (5.35) as follows:

$$\overset{\circ}{x} = -r(cu + v)x, \quad x(0) = \alpha_A, \tag{5.43}$$

$$\overset{\circ}{\lambda} = (1 - u - v) - \lambda\{\rho - r(cu + v)\}, \quad \lambda(0) = 0. \tag{5.44}$$

This is the starting point for our switching point synthesis. First, we
consider Case A.

Case A: $g \le r$.

Note that the constraint $v \le 1$ is superfluous in this case and the
only feasible subcases are A1, A3, and A6. Since $\lambda(0) = 0$, we have
$W_1(0) = W_2(0) = -1$, and Subcase A1 obtains.

Subcase A1: $W_1 = cr\lambda - 1 < 0$ and $W_2 = r\lambda - 1 < 0$.

From Row (1) of Table 5.1 we have $u^* = v^* = 0$, which gives the
state equation (5.43) and the adjoint equation (5.44) as

$$\overset{\circ}{x} = 0 \quad \text{and} \quad \overset{\circ}{\lambda} = 1 - \rho\lambda. \tag{5.45}$$

With the initial conditions given in (5.41) and (5.42), the solutions for $x$ and $\lambda$ are

$$x(\tau) = \alpha_A \quad \text{and} \quad \lambda(\tau) = (1/\rho)[1 - e^{-\rho\tau}]. \tag{5.46}$$

It is easy to see that because of the assumption $0 \le c < 1$, it follows that
if $W_2 = r\lambda - 1 < 0$, then $W_1 = cr\lambda - 1 < 0$. Therefore, to remain in
this subcase as $\tau$ increases, $W_2(\tau)$ must remain negative for some time
as $\tau$ increases. From (5.46), however, $\lambda(\tau)$ is increasing asymptotically
toward the value $1/\rho$ and therefore, $W_2(\tau)$ is increasing asymptotically
toward the value $r/\rho - 1$. Since we have assumed $r > \rho$, there exists a
$\tau_1$ such that $W_2(\tau_1) = (1 - e^{-\rho\tau_1})r/\rho - 1 = 0$. It is easy to compute

$$\tau_1 = (1/\rho)\ln[r/(r - \rho)]. \tag{5.47}$$

From this expression, it is clear that the firm leaves Subcase A1 provided
$\tau_1 < T$. Moreover, this observation also makes precise the notion of a
sufficiently large $T$ in Case A by having $T > \tau_1$.
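A quick numerical check of (5.46) and (5.47) follows. The rates $r = 0.15$ and $\rho = 0.10$ and the function names are hypothetical choices satisfying $r > \rho$; they are not taken from the text.

```python
import math

def tau1(r, rho):
    """Reverse time at which W2 = r*lambda - 1 reaches zero, from (5.47); needs r > rho > 0."""
    return math.log(r / (r - rho)) / rho

def lam_subcase_A1(tau, rho):
    """Adjoint variable in Subcase A1, from (5.46)."""
    return (1.0 - math.exp(-rho * tau)) / rho

r, rho = 0.15, 0.10            # hypothetical rates with r > rho
t1 = tau1(r, rho)
W2_at_t1 = r * lam_subcase_A1(t1, rho) - 1.0         # should vanish at the switching point
W2_before = r * lam_subcase_A1(0.5 * t1, rho) - 1.0  # still negative inside Subcase A1

print(t1, W2_at_t1, W2_before)
```

For these assumed rates the switch occurs at $\tau_1 = 10 \ln 3$, so a horizon $T$ longer than about 11 time units would count as sufficiently large in Case A.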

Remark 5.2 When $T$ is not sufficiently large, i.e., when $T < \tau_1$ in
Case A, the firm stays in Subcase A1. The optimal solution in this case
is $u^* = 0$ and $v^* = 0$, i.e., a policy of no investment.

Remark 5.3 Note that if we had assumed $r < \rho$, the firm never exits
from Subcase A1 regardless of the value of $T$. Obviously, there is no use
investing if the rate of return is less than the discount rate.

At reverse time $\tau_1$, we have $W_2 = 0$ and $W_1 < 0$ and the firm,
therefore, is in Subcase A6.

Subcase A6: $W_1 = cr\lambda - 1 < 0$ and $W_2 = r\lambda - 1 = 0$.

In this subcase, the optimal controls

$$u^* = 0, \quad 0 \le v^* \le g/r \tag{5.48}$$

from Row (3) of Table 5.1 are singular with respect to $v$. This case is
termed singular because the Hamiltonian maximizing condition does not
yield a unique value for the control $v$. In such cases, the optimal controls
are obtained by conditions required to sustain $W_2 = 0$ for a finite time
interval. This means we must have $\overset{\circ}{W}_2 = 0$, which in turn implies $\overset{\circ}{\lambda} = 0$.
To compute $\overset{\circ}{\lambda}$, we substitute (5.48) into (5.44) and obtain

$$\overset{\circ}{\lambda} = (1 - v^*) - \lambda[\rho - rv^*]. \tag{5.49}$$

Substituting $\lambda = 1/r$ (since $W_2 = 0$) in (5.49) and equating the
right-hand side to zero we obtain

$$r = \rho \tag{5.50}$$

as a necessary condition required to maintain singularity over a finite
time interval following $\tau_1$. Condition (5.50) is fortuitous and will not
generally hold. In fact we have assumed $r > \rho$. Thus, the firm will
not stay in Subcase A6 for a finite time interval. Furthermore, since
$r > \rho$, we have $\overset{\circ}{\lambda}(\tau_1) = (r - \rho)/r > 0$. Therefore, $W_2$ is increasing
from zero and becomes positive after $\tau_1$. Thus, at $\tau_1^+$ the firm switches
to Subcase A3.

Subcase A3: $W_2 = r\lambda - 1 > 0$.

The optimal controls in this subcase from Row (2) of Table 5.1 are

$$u^* = 0, \quad v^* = g/r. \tag{5.51}$$

The state and the adjoint equations are

$$\overset{\circ}{x} = -gx, \quad x(\tau_1) = \alpha_A, \tag{5.52}$$

$$\overset{\circ}{\lambda} = (1 - g/r) - \lambda(\rho - g), \quad \lambda(\tau_1) = 1/r, \tag{5.53}$$

with values at $\tau = \tau_1$ deduced from (5.46) and (5.47).

Since $\overset{\circ}{\lambda}(\tau_1) > 0$, $\lambda$ is increasing at $\tau_1$ from its value of $1/r$. A
further examination of the behavior of $\lambda(\tau)$ as $\tau$ increases will be carried
out under two different possible conditions: (i) $\rho > g$ and (ii) $\rho \le g$.

(i) $\rho > g$: Under this condition, as $\lambda$ increases, $\overset{\circ}{\lambda}$ decreases and
becomes zero at a value obtained by equating the right-hand side of
(5.53) to zero, i.e., at

$$\bar{\lambda} = \frac{1 - g/r}{\rho - g}. \tag{5.54}$$

This value $\bar{\lambda}$ is, therefore, an asymptote to the solution of (5.53) starting
at $\lambda(\tau_1) = 1/r$. Since $r > \rho > g$ in this case,

$$\bar{W}_2 = r\bar{\lambda} - 1 = \frac{r(1 - g/r)}{\rho - g} - 1 = \frac{r - \rho}{\rho - g} > 0, \tag{5.55}$$

which implies that the firm continues to stay in Subcase A3.

(ii) $\rho \le g$: Under this condition, as $\lambda(\tau)$ increases, $\overset{\circ}{\lambda}(\tau)$ increases.
So $W_2(\tau) = r\lambda(\tau) - 1$ continues to be greater than zero and the firm
continues to remain in Subcase A3.
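The two regimes can be illustrated with the closed-form solution of the linear equation (5.53). In the sketch below, the function name `lam_subcase_A3`, the elapsed-time argument `s` (reverse time measured from $\tau_1$), and all parameter values are assumptions made for illustration.

```python
import math

def lam_subcase_A3(s, r, rho, g):
    """Solution of (5.53) at reverse time tau1 + s, starting from lambda(tau1) = 1/r."""
    a = 1.0 - g / r          # constant term of the linear equation
    b = rho - g              # decay (b > 0) or growth (b < 0) coefficient
    lam0 = 1.0 / r
    if b == 0.0:
        return lam0 + a * s  # rho = g: linear growth
    return a / b + (lam0 - a / b) * math.exp(-b * s)

r, rho = 0.15, 0.10          # hypothetical rates with r > rho

# (i) rho > g: lambda rises toward the asymptote (1 - g/r)/(rho - g).
g_low = 0.05
lam_far = lam_subcase_A3(1000.0, r, rho, g_low)
W2_limit = r * lam_far - 1.0

# (ii) rho <= g (with g <= r, as in Case A): lambda keeps growing.
g_high = 0.12
growth_path = [lam_subcase_A3(s, r, rho, g_high) for s in (0.0, 10.0, 100.0)]

print(lam_far, W2_limit, growth_path)
```

With the first set of values the limiting $W_2$ equals $(r - \rho)/(\rho - g) = 1$, in agreement with (5.55); with the second set $\lambda$ increases without bound.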



Remark 5.4 With $\rho \le g$, note that $\lambda(\tau)$ increases to infinity as $\tau$
increases to infinity. This has important implications later when we deal
with the solution of the infinite horizon problem.

Since the optimal decisions for $\tau \ge \tau_1$ have been found to be
independent of $\alpha_A$ for $T$ sufficiently large, we can sketch the solution for Case
A in Figure 5.6 starting with $x_0$. This also gives the value of

$$\alpha_A = x_0 e^{g(T - \tau_1)} = x_0 e^{gT}[1 - \rho/r]^{g/\rho},$$

as shown in Figure 5.6.
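The terminal earnings level in Case A can be computed in two equivalent ways: by growing $x_0$ at the rate $g$ up to the switching time $T - \tau_1$, or from the closed form in terms of $\rho/r$. The sketch below compares the two; the function names and parameter values are hypothetical, and the branch for $T \le \tau_1$ reflects the no-investment policy of Remark 5.2.

```python
import math

def terminal_earnings(x0, r, rho, g, T):
    """Grow x0 at rate g until T - tau1, then hold it constant until T."""
    tau1 = math.log(r / (r - rho)) / rho
    if T <= tau1:
        return x0                      # horizon too short: no investment at all
    return x0 * math.exp(g * (T - tau1))

def terminal_earnings_closed_form(x0, r, rho, g, T):
    """Closed form x0 * e^{gT} * (1 - rho/r)^{g/rho}, valid when T > tau1."""
    return x0 * math.exp(g * T) * (1.0 - rho / r) ** (g / rho)

# Hypothetical parameters with g <= r (Case A) and T > tau1.
x0, r, rho, g, T = 1.0, 0.15, 0.10, 0.05, 20.0
alpha_direct = terminal_earnings(x0, r, rho, g, T)
alpha_closed = terminal_earnings_closed_form(x0, r, rho, g, T)
alpha_short_horizon = terminal_earnings(x0, r, rho, g, 5.0)

print(alpha_direct, alpha_closed, alpha_short_horizon)
```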

In the solution for Case A, there is only one switching point provided $T$ is sufficiently large (i.e., $T > \tau_1$ in this case). The switching time $t = T - \tau_1$ has an interesting economic interpretation. Namely, a dollar of earnings retained for investment needs at least $\tau_1$ units of time to be worthwhile. That means it pays to invest as much of the earnings as feasible before $T - \tau_1$, and it does not pay to invest any earnings after $T - \tau_1$. Thus, $T - \tau_1$ is the point of indifference between retaining earnings and paying dividends out of earnings. To see this directly, let us suppose the firm retains one dollar of earnings at $T - \tau_1$. Since this is the last time any earnings invested will be worthwhile, it is obvious (because all earnings are paid out) that the dollar just invested at $T - \tau_1$ yields dividends at the rate $r$ from $T - \tau_1$ to $T$. The value of this dividend stream in terms of $(T - \tau_1)$-dollars is

$$\int_0^{\tau_1} r e^{-\rho t}\,dt = \frac{r}{\rho}\left[1 - e^{-\rho \tau_1}\right], \tag{5.56}$$

which must be equated to one dollar to find the indifference point. Equating (5.56) to 1 yields precisely the value of $\tau_1$ given in (5.47).

Figure 5.6: Optimal Path for Case A: $g \le r$ (vertical axis: $x$; horizontal axis marked in reverse time at $\tau = T$, $\tau = \tau_1 = (1/\rho)\ln[r/(r - \rho)]$, and $\tau = 0$)
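The indifference argument can be verified with a few lines of arithmetic. The following is a minimal sketch, assuming the parameter values of Example 5.2 ($r = 0.15$, $\rho = 0.10$); the function names are chosen for this illustration and do not come from the text.

```python
import math

def tau1(r, rho):
    """Switching time (5.47): tau1 = (1/rho) * ln[r / (r - rho)], for r > rho."""
    return math.log(r / (r - rho)) / rho

def dividend_stream_value(r, rho, tau):
    """Right-hand side of (5.56): present value of dividends paid at
    rate r over a remaining horizon of length tau, discounted at rho."""
    return (r / rho) * (1 - math.exp(-rho * tau))

t1 = tau1(0.15, 0.10)
print(t1)                                     # 10 * ln(3), about 11 months
print(dividend_stream_value(0.15, 0.10, t1))  # equals one dollar at tau1
```

For a remaining horizon shorter than $\tau_1$ the stream is worth less than one dollar, which is why retention stops after $T - \tau_1$.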

With this interpretation of $\tau_1$, we conclude that enough earnings must be retained so as to make the firm grow exponentially at the maximum rate of $g$ until $t = T - \tau_1$. After this time, all of the earnings are paid out and the firm stops growing. Since $g \le r$ (assumed for Case A), the growth in the first part of the solution can be financed entirely from retained earnings. Thus, there is no need to resort to more expensive external equity financing. The latter will not be the case, however, in Case B when $g > r$, which we now discuss.

Case B: $g > r$.

Since $g/r > 1$, the constraint $v \le 1$ in Case B is relevant. The feasible subcases are B1, B8, B4, B7, and B3 shown adjacent to the darkened lines in Figure 5.5. As in Case A, it is obvious that the firm starts (in the reverse-time sense) in Subcase B1. Recall that $T$ is assumed to be sufficiently large here as well. This statement in Case B will be made precise in the course of our analysis. Furthermore, the solution when $T$ is not sufficiently large in Case B will be discussed in Remark 5.6.

Subcase B1: $W_1 = cr\lambda - 1 < 0$, $W_2 = r\lambda - 1 < 0$.

The analysis of this subcase is the same as Subcase A1. As in that subcase, the firm switches out at time $\tau = \tau_1$ to Subcase B8.

Subcase B8: $W_1 = cr\lambda - 1 < 0$, $W_2 = r\lambda - 1 = 0$.

In this subcase, the optimal controls

$$u^* = 0, \quad 0 \le v^* \le 1 \tag{5.57}$$

from Row (3) of Table 5.1 are singular with respect to $v$. As before in Subcase A6, the singular case cannot be sustained for a finite time because of our assumption $r > \rho$. As in Subcase A6, $W_2$ is increasing at $\tau_1$ from zero and becomes positive after $\tau_1$. Thus, at $\tau_1^+$, the firm finds itself in Subcase B4.

Subcase B4: $W_1 = cr\lambda - 1 < 0$, $W_2 = r\lambda - 1 > 0$.

The optimal controls in this subcase are

$$u^* = 0, \quad v^* = 1, \tag{5.58}$$

as shown in Row (5) of Table 5.1. The state and the adjoint equations are

$$\overset{\circ}{x} = -rx, \quad x(\tau_1) = \alpha_B, \tag{5.59}$$

with $\alpha_B$ a parameter to be determined, and

$$\overset{\circ}{\lambda} = \lambda(r - \rho), \quad \lambda(\tau_1) = 1/r. \tag{5.60}$$

Obviously, earnings are growing exponentially at rate $r$ and $\lambda(\tau)$ is increasing at rate $(r - \rho)$ as $\tau$ increases from $\tau_1$. Since $\lambda(\tau_1) = 1/r$, we have

$$\lambda(\tau) = \frac{1}{r} e^{(r - \rho)(\tau - \tau_1)} \quad \text{for } \tau \ge \tau_1. \tag{5.61}$$

As $\lambda$ increases, $W_1$ increases and becomes zero at a time $\tau_2$ defined by

$$W_1(\tau_2) = cr\lambda(\tau_2) - 1 = c e^{(r - \rho)(\tau_2 - \tau_1)} - 1 = 0, \tag{5.62}$$

which, in turn, gives

$$\tau_2 = \tau_1 + [1/(r - \rho)] \ln(1/c). \tag{5.63}$$

At $\tau_2^+$, the firm switches to Subcase B7.

Before proceeding to Subcase B7, let us observe that in Case B, we can now define $T$ to be sufficiently large when $T > \tau_2$. See Remark 5.6 when $T \le \tau_2$.
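The second switching time follows directly from (5.61) through (5.63). A minimal sketch, assuming the illustrative values $r = 0.15$, $\rho = 0.10$, $c = 0.98$ borrowed from Example 5.2 (the function names are hypothetical):

```python
import math

def switching_times(r, rho, c):
    """Return (tau1, tau2) from (5.47) and (5.63); requires r > rho, 0 < c <= 1."""
    tau1 = math.log(r / (r - rho)) / rho
    tau2 = tau1 + math.log(1 / c) / (r - rho)
    return tau1, tau2

def W1(tau, r, rho, c):
    """Switching function W1 = c*r*lambda - 1 along Subcase B4,
    with lambda(tau) taken from (5.61); meaningful for tau >= tau1."""
    tau1, _ = switching_times(r, rho, c)
    lam = math.exp((r - rho) * (tau - tau1)) / r
    return c * r * lam - 1

t1, t2 = switching_times(0.15, 0.10, 0.98)
print(t2 - t1)                     # extra time needed to recover the floatation cost
print(W1(t1, 0.15, 0.10, 0.98))    # c - 1 < 0 when B4 is entered
print(W1(t2, 0.15, 0.10, 0.98))    # zero (up to rounding) at tau2
```

With a floatation cost of only 2 percent, the gap $\tau_2 - \tau_1$ is well under one month for these values.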



Subcase B7: $W_1 = cr\lambda - 1 = 0$, $W_2 = r\lambda - 1 > 0$.

In Subcase B7, the optimal controls are

$$0 \le u^* \le (g - r)/rc, \quad v^* = 1. \tag{5.64}$$

From Row (6) in Table 5.1, these controls are singular with respect to $u$. To maintain this singular control over a finite time period, we must keep $W_1 = 0$ in the interval. This means we must have $\overset{\circ}{W}_1(\tau_2) = 0$, which, in turn, implies $\overset{\circ}{\lambda}(\tau_2) = 0$. To compute $\overset{\circ}{\lambda}$, we substitute (5.64) into (5.44) and obtain

$$\overset{\circ}{\lambda} = -u^* - \lambda\{\rho - r(cu^* + 1)\}. \tag{5.65}$$

Substituting $\lambda(\tau_2) = 1/rc$ (since $W_1(\tau_2) = 0$) in (5.65) and equating the right-hand side to zero, we obtain

$$r = \rho.$$

By our assumption $r > \rho$, a singular path cannot be sustained and the firm will not stay in Subcase B7 for a finite amount of time. Furthermore, from (5.65), we have

$$\overset{\circ}{\lambda}(\tau_2) = \frac{r - \rho}{rc} > 0, \tag{5.66}$$

which implies that $\lambda$ is increasing and, therefore, $W_1$ is increasing. Thus at $\tau_2^+$, the firm switches to Subcase B3.
Subcase B3: $W_1 = cr\lambda - 1 > 0$.

The optimal controls in this subcase from Row (4) of Table 5.1 are

$$u^* = \frac{g - r}{rc}, \quad v^* = 1. \tag{5.67}$$

The reverse-time state and the adjoint equations are

$$\overset{\circ}{x} = -gx, \tag{5.68}$$

$$\overset{\circ}{\lambda} = -\left(\frac{g - r}{rc}\right) + \lambda(g - \rho). \tag{5.69}$$

Since $\overset{\circ}{\lambda}(\tau_2) > 0$, $\lambda(\tau)$ is increasing. Furthermore, in Case B, we assume $g > r$. But $r > \rho$ has been assumed throughout the chapter. Therefore, $\rho < g$ and the second term in the right-hand side of (5.69) is increasing. That means $\overset{\circ}{\lambda}(\tau) > 0$ and $\lambda(\tau)$ continues to increase. Therefore, the firm continues to stay in Subcase B3.

Remark 5.5 Note that $\lambda(\tau)$ in Case B increases without bound as $\tau$ becomes large. This will have important implications when dealing with the infinite horizon problem in Section 5.2.4.

We now can sketch the complete solution for Case B in Figure 5.7. 

In the solution for Case B, there are two switching points instead of just one as in Case A. The reason for two switching points becomes quite clear when we interpret the significance of $\tau_1$ and $\tau_2$. It is obvious that $\tau_1$ has the same meaning as before. Namely, if $\tau_1$ is the remaining time to the horizon, the firm is indifferent between investing a dollar of earnings and paying it out as dividends. Intuitively, it seems that since external equity is more expensive than retained earnings as a source of financing, investment financed by external equity requires more time to be worthwhile. Thus,

$$\tau_2 - \tau_1 = \frac{1}{r - \rho} \ln(1/c) \tag{5.70}$$

should be the time required to compensate for the floatation cost of external equity.

To see this, we suppose that the firm issues a dollar's worth of stock at time $t = T - \tau_2$. While the cost of this issue is one dollar, the capital acquired is $c$ dollars because of the floatation cost $(1 - c)$. Since we are attempting to find the breakeven time for external equity, it is obvious that retaining all of the earnings for investment is still profitable. Thus, there is no dividend from $T - \tau_2$ to $T - \tau_1$ and the firm is growing at the rate $r$. Therefore, the value of this investment at $(T - \tau_1)$ measured in $(T - \tau_2)$-dollars is

$$c e^{r(\tau_2 - \tau_1)} e^{-\rho(\tau_2 - \tau_1)} = c e^{(r - \rho)(\tau_2 - \tau_1)} = c e^{\ln(1/c)} = 1. \tag{5.71}$$

Equation (5.71) states that one $(T - \tau_2)$-dollar of external equity at time $(T - \tau_2)$, which brings in $c$ dollars of capital at time $T - \tau_2$, is equivalent to one $(T - \tau_2)$-dollar investment at $(T - \tau_1)$. But the firm is indifferent between investing or not investing the costless retained earnings at $(T - \tau_1)$. To summarize, the firm is indifferent between issuing a dollar's worth of stock at $(T - \tau_2)$ or not issuing it. Before $(T - \tau_2)$, it pays to issue stocks at as large a rate as is feasible. After $(T - \tau_2)$, it does not pay to issue any external equity at all.

Figure 5.7: Optimal Path for Case B: $g > r$ (horizontal axis marked in forward time at $t = 0$, $t = T - \tau_2$, $t = T - \tau_1$, $t = T$, and correspondingly in reverse time at $\tau = T$, $\tau = \tau_2$, $\tau = \tau_1$, $\tau = 0$)

Remark 5.6 When $T$ is not sufficiently large, i.e., when $T < \tau_2$ in Case B, the optimal solution is the same as in Remark 5.2 when $T \le \tau_1$. If $\tau_1 < T \le \tau_2$, then the optimal solution is $u^* = 0$ and $v^* = 1$ until $t = T - \tau_1$. For $t > T - \tau_1$, the optimal solution is $u^* = 0$ and $v^* = 0$.

Having completely solved the finite horizon case, we now turn to the 
infinite horizon case. 

5.2.4 Solution for the Infinite Horizon Problem 

As indicated in Section 3.5, for the infinite horizon case the transversality condition must be changed to

$$\lim_{t \to \infty} e^{-\rho t}\lambda(t) = 0. \tag{5.72}$$

Furthermore, this condition may no longer be a necessary condition; see Section 3.5. It is a sufficient condition for optimality, however, in conjunction with the other sufficiency conditions stated in Theorem 2.1.

A usual method of solving an infinite horizon problem is to take the limit as $T \to \infty$ of the finite horizon solution and then to prove that the limiting solution so obtained solves the infinite horizon problem. The proof is important because the limit of the solution may or may not solve the infinite horizon problem. The proof is usually based on the sufficiency conditions of Theorem 2.1, modified slightly as indicated above for the infinite horizon case.

We now analyze the infinite horizon case following the above proce- 
dure. We start with Case A. 

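Case A: $g \le r$.

The limiting solution in this case is given as Subcase A3, i.e.,

$$u = 0, \quad v = g/r, \tag{5.73}$$

$$\dot{x} = gx, \quad x(0) = x_0, \tag{5.74}$$

and

$$\dot{\lambda} = -(1 - g/r) - \lambda(g - \rho), \quad \lim_{t \to \infty} e^{-\rho t}\lambda(t) = 0. \tag{5.75}$$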
To prove that the solution (5.73) is optimal, we must show that there exists a solution of (5.75) which with (5.73) and (5.74) satisfies the maximum principle (i.e., stays in Subcase A3).

Going back to Subcase A3 in Section 5.2.3 (see (5.54)), we note that in the reverse-time sense, $\lambda(\tau)$ is increasing asymptotically toward the value

$$\bar{\lambda} = \frac{1 - g/r}{\rho - g}$$

only in case $\rho > g$. Otherwise, i.e., when $\rho \le g$, as we noted in Remark 5.4, $\lambda(\tau)$ increases without bound. Thus, for $\rho > g$, in which case $r > \rho > g$, $\lambda = \bar{\lambda}$ clearly satisfies (5.75). Furthermore,

$$W_2 = r\bar{\lambda} - 1 = \frac{r - \rho}{\rho - g} > 0,$$

which implies that the firm stays in Subcase A3, i.e., the maximum principle holds.

We have now proved the following result: for $\rho > g$ in Case A,

$$u^* = 0, \quad v^* = g/r,$$

along with the corresponding state trajectory

$$x^*(t) = x_0 e^{gt}$$

and the adjoint trajectory

$$\lambda(t) = \frac{1 - g/r}{\rho - g},$$

represent an optimal solution for the infinite horizon problem. Note that the assumption $\rho > g$ together with our overall assumption that $\rho < r$ gives $g < r$ so that $1 - v^* > 0$, which means a constant fraction of earnings is being paid as dividends.

Note that the value of the adjoint variable $\lambda$ in this case is a constant and its form is reminiscent of Gordon's classic formula; see Gordon (1962). In the control theory framework, the value of $\lambda$ represents the marginal worth per additional unit of earnings. Obviously, a unit increase in earnings will mean an increase of $1 - v^*$ or $1 - g/r$ units in dividends. This, of course, should be capitalized at a rate equal to the discount rate less the growth rate (i.e., $\rho - g$), which is precisely Gordon's formula.
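The capitalization argument can be checked by discounting the dividend stream directly. The sketch below is an illustration under stated assumptions: it uses the parameter values of Example 5.2, a simple midpoint rule truncated at a hypothetical horizon of 1000 time units, and function names invented for this example.

```python
import math

def gordon_value(r, rho, g):
    """Constant adjoint value (1 - g/r) / (rho - g); requires rho > g."""
    return (1 - g / r) / (rho - g)

def discounted_dividends(r, rho, g, horizon=1000.0, steps=100_000):
    """Midpoint-rule approximation of the integral over [0, horizon] of
    e^{-rho t} * (1 - g/r) * e^{g t}, i.e. the present value of the extra
    dividends generated by one additional unit of earnings growing at rate g."""
    h = horizon / steps
    payout = 1 - g / r
    total = sum(math.exp(-(rho - g) * (i + 0.5) * h) for i in range(steps))
    return payout * total * h

print(gordon_value(0.15, 0.10, 0.05))
print(discounted_dividends(0.15, 0.10, 0.05))
```

Both numbers agree to several decimal places, which is the sense in which $\lambda$ is the marginal worth of a unit of earnings.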








For $\rho \le g$, the reverse-time construction in Subcase A3 implies that $\lambda(\tau)$ increases without bound as $\tau$ increases. Thus, we cannot find any $\lambda$ which satisfies (5.75). A moment's reflection shows that for $\rho \le g$, the objective function can be made infinite. For example, any control policy with earnings growing at rate $q$, $\rho \le q \le g$, coupled with a partial dividend payout, i.e., a constant $v$ such that $0 < v < 1$, gives an infinite value for the objective function. That is, with $u^* = 0$, $v^* = q/r < 1$, we have

$$J = \int_0^\infty e^{-\rho t}(1 - u - v)x\,dt = \int_0^\infty e^{-\rho t}(1 - v)x_0 e^{qt}\,dt = \infty.$$
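The divergence is easy to see from the truncated integral. The following sketch evaluates the objective over a finite horizon $H$ in closed form; the values $\rho = 0.10$, $q = 0.12$, $r = 0.15$, $x_0 = 1000$ are hypothetical and chosen only so that $\rho \le q < r$.

```python
import math

def truncated_objective(H, x0, rho, q, r):
    """Integral over [0, H] of e^{-rho t} (1 - v) x0 e^{q t} dt with u = 0
    and v = q/r; assumes rho <= q < r so that 0 < v < 1."""
    v = q / r
    if q == rho:
        # Integrand is the constant (1 - v) * x0.
        return (1 - v) * x0 * H
    return (1 - v) * x0 * (math.exp((q - rho) * H) - 1) / (q - rho)

for H in (50.0, 100.0, 200.0, 400.0):
    print(H, truncated_objective(H, x0=1000.0, rho=0.10, q=0.12, r=0.15))
```

The truncated value keeps growing with $H$, linearly when $q = \rho$ and exponentially when $q > \rho$, so no finite limit exists.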

Since there are many policies which give an infinite value to the objective function, the choice among them may be decided on subjective grounds. We shall briefly discuss only the constant (over time) optimal policies. If $g < r$, then the rate of growth $q$ may be chosen in the closed interval $[\rho, g]$; if $g = r$, then $q$ may be chosen in the half-open interval $[\rho, r)$. In either case, the choice of a low rate of growth (i.e., a high proportional dividend payout) would mean a higher dividend rate (in dollars per unit time) early in time, but a lower dividend rate later in time because of the slower growth rate. Similarly, the choice of a high growth rate means the opposite in terms of dividend payments in dollars per unit time.

To conclude, we note that for $\rho \le g$ in Case A, the limiting solution of the finite case is an optimal solution for the infinite horizon problem in the sense that the objective function becomes infinite. However, this will not be the situation in Case B; see also Remark 5.7.

Case B: $g > r$.

The limit of the finite horizon optimal solution is to grow at the maximum allowable growth rate with

$$u = \frac{g - r}{rc} \quad \text{and} \quad v = 1$$

all the way. Since $\tau_1$ disappears in the limit, the stockholders will never collect dividends. The firm has become an infinite sink for investment. In fact, the limiting solution is a pessimal solution because the value of the objective function associated with it is zero. From the point of view of optimal control theory, this can be explained as before in Case A when $\rho \le g$. In Case B, we have $g > r$ so that (since $r > \rho$ throughout the chapter) we have $\rho < g$. For this, as noted in Remark 5.5, $\lambda(\tau)$ increases without bound as $\tau$ increases and, therefore, (5.71) does not have a solution.

As in Case A with $\rho \le g$, any control policy with earnings growing at rate $q \in [\rho, g]$ coupled with a constant $v$, $0 < v < 1$, has an infinite value for the objective function.

In summary, we note that the only nondegenerate case in the infinite horizon problem is when $\rho > g$. In this case, which occurs only in Case A, the policy of maximum allowable growth is optimal. On the other hand, when $\rho \le g$, whether in Case A or B, the infinite horizon problem has nonunique policies with infinite values for the objective function.

Before solving a numerical example we will make an interesting re- 
mark concerning Case B. 

Remark 5.7 Let $(u_T^*, v_T^*)$ denote the optimal control for the finite horizon problem in Case B. Let $(u_\infty^*, v_\infty^*)$ denote any optimal control for the infinite horizon problem in Case B. We already know that $J(u_\infty^*, v_\infty^*) = \infty$. Define an infinite horizon control $(u_\infty, v_\infty)$ by extending $(u_T^*, v_T^*)$ as follows:

$$(u_\infty, v_\infty) = \lim_{T \to \infty} (u_T^*, v_T^*).$$

We now note that for our model in Case B, we have

$$\lim_{T \to \infty} J(u_T^*, v_T^*) = \infty \quad \text{and} \quad J(u_\infty, v_\infty) = 0. \tag{5.77}$$

Obviously $(u_\infty, v_\infty)$ is not an optimal control for the infinite horizon problem. Since the two terms in (5.77) are not equal, we can say in technical terms that $J(u, v)$, regarded as a mapping, is not a closed mapping. If we introduce a salvage value $Bx(T)$, $B > 0$, however, for the finite horizon problem, then the new objective function

$$J(u, v) = \begin{cases} \int_0^T e^{-\rho t}(1 - u - v)x\,dt + Bx(T)e^{-\rho T}, & \text{if } T < \infty, \\ \int_0^\infty e^{-\rho t}(1 - u - v)x\,dt + \lim_{T \to \infty}\{Bx(T)e^{-\rho T}\}, & \text{if } T = \infty, \end{cases}$$

is a closed mapping in the sense that

$$\lim_{T \to \infty} J(u_T^*, v_T^*) = \infty \quad \text{and} \quad J(u_\infty, v_\infty) = \infty$$

for the modified model.







Example 5.2 We will now assign numbers to the various parameters in the optimal financing problem in order to compute the optimal solution. Let

$$x_0 = 1000/\text{month}, \quad T = 60 \text{ months},$$

$$r = 0.15, \quad \rho = 0.10, \quad g = 0.05, \quad c = 0.98.$$

Solution. Since $g \le r$, the problem belongs to Case A. We compute

$$\tau_1 = \frac{1}{\rho}\ln[r/(r - \rho)] = 10 \ln 3 \approx 11 \text{ months}.$$

The optimal controls for the problem are

$$u^* = 0, \quad v^* = g/r = 1/3, \quad t \in [0, 49),$$

$$u^* = 0, \quad v^* = 0, \quad t \in [49, 60],$$

and the optimal state trajectory is

$$x^*(t) = \begin{cases} 1000 e^{0.05t}, & t \in [0, 49), \\ 1000 e^{2.45}, & t \in [49, 60]. \end{cases}$$

The value of the objective function is

$$J^* = \int_0^{49} e^{-0.1t}(1 - 1/3)(1000)e^{0.05t}\,dt + \int_{49}^{60} 1000 e^{2.45} e^{-0.1t}\,dt = 12{,}578.75.$$

Note that the infinite horizon problem is well defined in this case, since $g < \rho$ and $g < r$. The optimal controls are

$$u^* = 0, \quad v^* = g/r = 1/3,$$

and

$$J = \int_0^\infty e^{-0.1t}(2/3)(1000)e^{0.05t}\,dt = 2000/0.15 = 13{,}333\tfrac{1}{3}.$$
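The two objective values in Example 5.2 can be recomputed from closed-form antiderivatives. The sketch below is only a check under the example's stated data, with the switch taken at $t = 49$ as in the text; the function names are hypothetical. Evaluated this way, the finite horizon value comes out near 12,758 rather than the printed 12,578.75, which may indicate transposed digits in the printed figure, although that reading is an assumption.

```python
import math

x0, T = 1000.0, 60.0
r, rho, g = 0.15, 0.10, 0.05

def finite_horizon_value(t_switch):
    """Objective for: retain v = g/r until t_switch, then pay out everything."""
    v = g / r
    # Growth phase: integral of e^{-rho t} (1 - v) x0 e^{g t} over [0, t_switch].
    growth = (1 - v) * x0 * (1 - math.exp(-(rho - g) * t_switch)) / (rho - g)
    # Payout phase: earnings frozen at x(t_switch), all paid as dividends.
    x_switch = x0 * math.exp(g * t_switch)
    payout = x_switch * (math.exp(-rho * t_switch) - math.exp(-rho * T)) / rho
    return growth + payout

def infinite_horizon_value():
    """Objective for v = g/r forever; finite because rho > g."""
    return (1 - g / r) * x0 / (rho - g)

print(round(finite_horizon_value(49.0), 2))
print(round(infinite_horizon_value(), 2))
```

Using the unrounded switching time $T - \tau_1 = 60 - 10\ln 3$ instead of 49 changes the finite horizon value only slightly.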
EXERCISES FOR CHAPTER 5

5.1 Find the optimal policies for the simple cash balance model (Sections 5.1.1 and 5.1.2) with $x_0 = 2$, $y_0 = 2$, $U_1 = U_2 = 5$, $T = 1$, $\alpha = 0.01$, and the following specifications for the interest rates:

(a) $r_1(t) = 1/2$, $r_2(t) = 1/3$.

(b) $r_1(t) = t/2$, $r_2(t) = 1/3$.

(c) Sketch the optimal policy in (b) in the $(t, \lambda_2/\lambda_1)$ space, as in Figure 5.2.

5.2 Formulate the extension of the model in Section 5.1.3 with finite positive bounds $U_1$ and $U_2$ on the control variables for

(a) $\alpha = 0$.

(b) $\alpha > 0$.

[Hint: Adjoin the control constraints to the Hamiltonian in forming the Lagrangian. For (b), write $u = u_1 - u_2$ as in (5.10).]

5.3 It is also possible to guess the solution for Example 5.1 when $\alpha > 0$. Show that the optimum policy remains unchanged if $\alpha < 1 - 1/e$. [Hint: Use an elementary compound interest argument.]

5.4 Discuss the optimal equity financing model of Section 5.2.1 when $c = 1$. Show that only one control variable is needed. Then solve the problem.

5.5 What happens in the optimal equity financing model when $r < \rho$? Guess the optimal solution (without actually solving it).

5.6 When $g = r$ in Case A of the optimal equity financing model, why is the limit of the solution not the solution to the infinite horizon problem?

5.7 Let $g = 0.12$ in Example 5.2. Re-solve the finite horizon problem with this new value of $g$. Also, for the infinite horizon problem, state a policy which yields an infinite value for the objective function.

5.8 Reformulate the simple cash balance problem of Sections 5.1.1 and 5.1.2, if the earnings on bonds are paid in cash.
5.9 Add a salvage value function $Bx(T)$, where $B > 0$, to the objective function in the problem (5.33) and analyze the modified problem due to Sethi (1978b). Show how the solution changes when $B$ changes from 0 to $1/rc$.

5.10* Redo Example 5.1 with the control constraints $-1 \le u \le 1$.

(a) Give reasons why the solution shown in Figure 5.8 is optimal.

(b) Compute $f(t^*)$ in terms of $t^*$.

(c) Compute $J$ in terms of $t^*$. Find $t^*$ that maximizes $J$ by setting $dJ/dt^* = 0$.

[Hint: Because this is a long and tedious calculus problem, you may wish to use Mathematica or MAPLE to solve this problem.]

Figure 5.8: Solution for Exercise 5.10

5.11 For the solution found in Exercise 5.10, show by using the maximum principle and Exercise 5.10 that the adjoint trajectories are:



1 Ai(0) 0<t<5, 

Ai(5)e-03(^-^)-e3-o^ 5 < t < 10, 



Ai(i) = 
and



M{t) = I 



2 I 1^3-0.3t 

3 3^ ’ 



0<t< f{t*) ^ 6.52, 
f{t*) < t < 10, 



where $t^* \approx 1.97$. Sketches of these functions are shown in Figure 5.9.

Figure 5.9: Adjoint Trajectories for Exercise 5.11 (curves $\lambda_1$ and $\lambda_2$)



5.12* Suppose we extend the model (5.33) to include debt. For this let $y$ denote the total debt at time $t$ and $w$ denote the amount of debt issued expressed as a proportion of current earnings. Then, the state equation for $y$ is

$$\dot{y} = wx, \quad y(0) = y_0.$$

How would you modify the state equation for $x$ and the growth constraint (5.31)? Assume $i$ to be the constant interest rate on debt, and $i < r$.
5.13* Find the form of the optimal policy for the following model due to Davis and Elzinga (1972):

$$\max_{u,v} \left\{ J = \int_0^T e^{-\rho t}(1 - v)Er\,dt + P(T)e^{-\rho T} \right\}$$

subject to

$$\dot{P} = k[rE(1 - v) - \rho P], \quad P(0) = P_0,$$

$$\dot{E} = rE[v + u(c - E/P)], \quad E(0) = E_0,$$

and the control constraints

$$u \ge 0, \quad cu + v \le g/r.$$

Here $P$ denotes the price of a stock, $E$ denotes equity per stock and $k > 0$ is a constant. Also, assume $r > \rho > g$. This example requires the use of the generalized Legendre-Clebsch condition (D.40) in Appendix D.

5.14* Remove the assumption of an arbitrary upper bound $g$ on the
growth rate in the financing model of Section 5.2.1 by introducing a
convex cost associated with the growth rate. With r re-interpreted 
now as the gross rate of return, obtain the net increase in rate of 
earnings by the rate of increase in gross earnings less the cost asso- 
ciated with the growth rate. Also assume c = 1 as in Exercise 5.4. 
Formulate the resulting model and apply the maximum principle 
to find the form of the optimal policy. You may assume the cost 
function to be quadratic in the growth rate to get an explicit form 
for the solution. 




Chapter 6 

Applications to Production 
and Inventory 



Applications of optimization methods to production and inventory problems date back at least to the classical EOQ (Economic Order Quantity) model or the lot size formula of Harris (1913). The EOQ is essentially a static model in the sense that the demand is constant and only a stationary solution is sought. A dynamic version of the lot size model was analyzed by Wagner and Whitin (1958). The solution methodology used there was dynamic programming.

An important dynamic production planning model was developed by Holt, Modigliani, Muth, and Simon (1960). In their model, referred to as the HMMS model, they considered both production costs and inventory holding costs over time. They used calculus of variations techniques to solve the continuous-time version of their model.

In Section 6.1, a model of Thompson and Sethi (1980), similar to the HMMS model, is formulated and completely solved using optimal control theory. The turnpike solution is also obtained when the horizon is infinite.

In Section 6.2, the continuous wheat trading model of Ijiri and Thompson (1970) is introduced. In this model a wheat speculator must buy and sell wheat in an optimal way in order to take advantage of changes in the price of wheat over time. This model permits short-selling of wheat. The model has been further analyzed by Norström (1978) with short-selling disallowed. A simple example which illustrates his results is presented in Section 6.2.4.

In Section 6.3, we introduce a warehousing constraint, i.e., an upper bound on the amount of wheat that can be stored, in the wheat trading model. In addition to being realistic, the introduction of the warehousing constraint helps us to illustrate the concepts of decision and forecast horizons by means of examples. This section is expository in nature, but theoretical developments of these ideas are available in the literature.



6.1 A Production-Inventory System 

Many manufacturing enterprises use a production-inventory system to manage fluctuations in consumer demand for the product. Such a system consists of a manufacturing plant and a finished goods warehouse to store those products which are manufactured but not immediately sold. Once a product is made and put into inventory, it incurs inventory holding costs of two kinds: (i) costs of physically storing the product, insuring it, etc.; and (ii) opportunity cost of having the firm's money invested or tied up in the unsold inventory. The advantages of having products in inventory are: first, they are immediately available to meet demand; second, excess production during low demand periods can be stored in the warehouse and made available for sale during high demand periods. This usually permits the use of a smaller manufacturing plant than would otherwise be necessary, and also reduces the difficulties of managing the system.

The optimization problem is to balance the benefits of production smoothing versus the costs of holding inventory. Some references that apply control theory to production and inventory problems are: Sprzeuzkouski (1967), Hwang, Fan, and Erickson (1967), Pekelman (1974), Bensoussan, Hurst, and Näslund (1974), Hartl and Sethi (1984a), Feichtinger and Hartl (1985a), Stöppler (1985), and Gaimon (1988). Other issues such as process improvement, automation, quality, and learning effects are also important in production problems. For optimal control applications dealing with these issues, see Vickson (1985), Amit and Ilan (1990), Li and Rajagopalan (1998), Gaimon (1985a, 1985b), Jørgensen, Kort, and Zaccour (1999), and Carrillo and Gaimon (2000).



6.1.1 The Production-Inventory Model 

We consider a factory producing a single homogeneous good and having a finished goods warehouse. To state the model we define the following quantities:

$I(t)$ = the inventory level at time $t$ (state variable),

$P(t)$ = the production rate at time $t$ (control variable),

$S(t)$ = the sales rate at time $t$ (exogenous variable); assumed to be bounded and differentiable for $t \ge 0$,

$T$ = the length of the planning period,

$\hat{I}$ = the inventory goal level,

$I_0$ = the initial inventory level,

$\hat{P}$ = the production goal level,

$h$ = the inventory holding cost coefficient; $h > 0$,

$c$ = the production cost coefficient; $c > 0$,

$\rho$ = the constant nonnegative discount rate; $\rho \ge 0$.


The interpretation of the inventory goal level $\hat{I}$ is that it is a safety stock that the company wants to keep on hand. For example, $\hat{I}$ could be two months of average sales or $\hat{I}$ could be 100 units of the finished goods. Similarly, the production goal level $\hat{P}$ can be interpreted as the most efficient level at which it is desired to run the factory.

With this notation we can now state the conditions of the model. The first is the stock-flow differential equation

$$\dot{I}(t) = P(t) - S(t), \quad I(0) = I_0, \tag{6.1}$$

which says that the inventory at time $t$ is increased by the production rate and decreased by the sales rate. The objective function of the model is:

$$\min_{P \ge 0} \left\{ J = \int_0^T e^{-\rho t}\left[\frac{h}{2}(I - \hat{I})^2 + \frac{c}{2}(P - \hat{P})^2\right] dt \right\}. \tag{6.2}$$

The interpretation of the objective function is that we want to keep the inventory $I$ as close as possible to its goal level $\hat{I}$, and also keep the production rate $P$ as close as possible to its goal level $\hat{P}$. The quadratic terms $(h/2)(I - \hat{I})^2$ and $(c/2)(P - \hat{P})^2$ impose "penalties" for having either $I$ or $P$ not being close to its corresponding goal level.
6.1.2 Solution by the Maximum Principle

We now associate an adjoint function $\lambda$ with equation (6.1) and can write the current-value Hamiltonian function as

$$H = \lambda(P - S) - \frac{h}{2}(I - \hat{I})^2 - \frac{c}{2}(P - \hat{P})^2. \tag{6.3}$$

In (6.3), we have used the negative of the (undiscounted) integrand in (6.2), since the minimization of $J$ in (6.2) is equivalent to the maximization of $-J$.

To apply the Pontryagin maximum principle, we differentiate (6.3) with respect to $P$ and set the resulting expression equal to 0, which gives

$$\frac{\partial H}{\partial P} = \lambda - c(P - \hat{P}) = 0.$$

From this we obtain the decision rule

$$P = \hat{P} + \lambda/c, \tag{6.4}$$

as long as the right-hand side is nonnegative. In most cases $P$ is constrained to be nonnegative, so that the form of the optimal control is

$$P^* = \max\{\hat{P} + \lambda/c, 0\}. \tag{6.5}$$

For the rest of this section and the next we shall assume that $\hat{P}$ is large enough so that (6.4) always gives nonnegative production values. (The case when $P$ may become negative will be treated in Section 6.1.4.) With the assumptions of a sufficiently large $\hat{P}$ and a sufficiently small $I_0$, we have $P^* = \hat{P} + \lambda/c$ and we can substitute (6.4) into (6.1) to obtain

$$\dot{I} = \hat{P} + \lambda/c - S, \quad I(0) = I_0. \tag{6.6}$$

The equation for the adjoint variable is easily found to be

$$\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial I} = \rho\lambda + h(I - \hat{I}), \quad \lambda(T) = 0. \tag{6.7}$$

We see that (6.6) is an initial value problem and (6.7) a terminal value problem, so that together these give a two-point boundary value problem. We shall employ a method to solve these two equations simultaneously, which works only in some special cases including the present. The method is the well-known trick used to solve simultaneous differential equations by differentiation and substitution until one of the variables is eliminated. Specifically, we differentiate (6.6) with respect to $t$, which creates an equation with $\dot{\lambda}$ in it. We then use (6.7) to eliminate $\dot{\lambda}$ and (6.6) to eliminate $\lambda$ from the resulting equation as follows:

$$\ddot{I} = \dot{\lambda}/c - \dot{S} = \rho(\lambda/c) + (h/c)(I - \hat{I}) - \dot{S} = \rho(\dot{I} - \hat{P} + S) + (h/c)(I - \hat{I}) - \dot{S}.$$

We rewrite this as

$$\ddot{I} - \rho\dot{I} - \alpha^2 I = -\alpha^2\hat{I} - \dot{S} - \rho(\hat{P} - S), \tag{6.8}$$

where the constant $\alpha$ is given by

$$\alpha = \sqrt{h/c}. \tag{6.9}$$

We can now solve (6.8) by using the standard method described in Appendix A. The auxiliary equation for (6.8) is

$$m^2 - \rho m - \alpha^2 = 0,$$

which has the two real roots

$$m_1 = \left(\rho - \sqrt{\rho^2 + 4\alpha^2}\right)/2, \quad m_2 = \left(\rho + \sqrt{\rho^2 + 4\alpha^2}\right)/2; \tag{6.10}$$

note that $m_1 < 0$ and $m_2 > 0$. We can therefore write the general solution to (6.8) as

$$I(t) = a_1 e^{m_1 t} + a_2 e^{m_2 t} + Q(t), \quad I(0) = I_0, \tag{6.11}$$

where $Q(t)$ is a particular integral of (6.8).

We shall say that $Q(t)$ is a special particular integral of (6.8) if it has no additive terms involving $e^{m_1 t}$ and $e^{m_2 t}$. From now on we will always assume that $Q(t)$ is a special particular integral.

Although (6.11) has two arbitrary constants $a_1$ and $a_2$, it has only one boundary condition. To get the other boundary condition we differentiate (6.11), substitute the result into (6.6), and solve for $\lambda$. We obtain

$$\lambda(t) = c\left(m_1 a_1 e^{m_1 t} + m_2 a_2 e^{m_2 t} + \dot{Q} + S - \hat{P}\right), \quad \lambda(T) = 0. \tag{6.12}$$

Note that we have imposed the boundary condition on $\lambda$ so that we can determine the constants $a_1$ and $a_2$.





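To do the latter we define two more constants:

$$b_1 = I_0 - Q(0), \tag{6.13}$$

$$b_2 = \hat{P} - \dot{Q}(T) - S(T). \tag{6.14}$$

We now impose the boundary conditions in (6.11) and (6.12) and solve for $a_1$ and $a_2$ as follows:

$$a_1 = \frac{b_2 e^{m_1 T} - m_2 b_1 e^{(m_1 + m_2)T}}{m_1 e^{2m_1 T} - m_2 e^{(m_1 + m_2)T}}, \tag{6.15}$$

$$a_2 = \frac{b_2 e^{m_1 T} - m_1 b_1 e^{2m_1 T}}{m_2 e^{(m_1 + m_2)T} - m_1 e^{2m_1 T}}. \tag{6.16}$$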
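The constants can also be obtained by solving the two boundary conditions numerically, which gives a check on any closed-form expression for $a_1$ and $a_2$. The sketch below is illustrative: the function names and the sample values ($h = c = 1$, $\rho = 0.05$, $T = 20$, $b_1 = -5$, $b_2 = 5$) are hypothetical and are not taken from the text.

```python
import math

def roots(h, c, rho):
    """Roots (m1, m2) of the auxiliary equation m^2 - rho*m - alpha^2 = 0,
    with alpha^2 = h/c, as in (6.9) and (6.10)."""
    alpha2 = h / c
    disc = math.sqrt(rho ** 2 + 4 * alpha2)
    return (rho - disc) / 2, (rho + disc) / 2

def coefficients(b1, b2, m1, m2, T):
    """Solve the boundary conditions implied by (6.11) and (6.12):
         a1 + a2 = b1
         m1*a1*e^{m1 T} + m2*a2*e^{m2 T} = b2
    by direct elimination."""
    e1, e2 = math.exp(m1 * T), math.exp(m2 * T)
    a2 = (b2 - m1 * b1 * e1) / (m2 * e2 - m1 * e1)
    a1 = b1 - a2
    return a1, a2

m1, m2 = roots(h=1.0, c=1.0, rho=0.05)
a1, a2 = coefficients(b1=-5.0, b2=5.0, m1=m1, m2=m2, T=20.0)
print(m1, m2)   # one negative and one positive root
print(a1, a2)   # a1 is close to b1 and a2 is tiny when T is large
```

For a long horizon the computed $a_1$ is nearly $b_1$ and $a_2$ is nearly $(b_2/m_2)e^{-m_2 T}$.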



If we recall that $m_1$ is negative and $m_2$ is positive, then when $T$ is sufficiently large so that $e^{m_1 T}$ and $e^{2m_1 T}$ are negligible, we can write

$$a_1 \approx b_1, \tag{6.17}$$

$$a_2 \approx \frac{b_2}{m_2} e^{-m_2 T}. \tag{6.18}$$

Note that for a large $T$, $e^{-m_2 T}$ is close to zero and, therefore, $a_2$ is close to zero. However, the reason for retaining the exponential term in (6.18) is that $a_2$ is multiplied by $e^{m_2 t}$ in (6.12) which, while small when $t$ is small, becomes large and important when $t$ is close to $T$.

With these values of $a_1$ and $a_2$ and with (6.4), (6.11), and (6.12), we now write the expressions for $I$, $P$, and $\lambda$. We will break each expression into three parts: the first part labeled Starting Correction is important only when $t$ is small; the second part labeled Turnpike Expression is significant for all values of $t$; and the third part labeled Ending Correction is important only when $t$ is close to $T$.



$$I = \underbrace{b_1 e^{m_1 t}}_{\text{Starting Correction}} + \underbrace{Q}_{\text{Turnpike Expression}} + \underbrace{\frac{b_2}{m_2} e^{m_2(t - T)}}_{\text{Ending Correction}}, \tag{6.19}$$

$$P = \underbrace{m_1 b_1 e^{m_1 t}}_{\text{Starting Correction}} + \underbrace{\dot{Q} + S}_{\text{Turnpike Expression}} + \underbrace{b_2 e^{m_2(t - T)}}_{\text{Ending Correction}}, \tag{6.20}$$

$$\lambda = \underbrace{c\,m_1 b_1 e^{m_1 t}}_{\text{Starting Correction}} + \underbrace{c(\dot{Q} + S - \hat{P})}_{\text{Turnpike Expression}} + \underbrace{c\,b_2 e^{m_2(t - T)}}_{\text{Ending Correction}}. \tag{6.21}$$

Note that if $b_1 = 0$, which means $I_0 = Q(0)$, then there is no starting correction. In other words, $I_0 = Q(0)$ is a starting inventory that causes the solution to be on the turnpike initially. In the same way, if $b_2 = 0$, then the ending correction vanishes in each of these formulas, and the solution stays on the turnpike until the end.

Expressions (6.19) and (6.20) represent approximate closed-form solutions for the optimal inventory and production functions $I$ and $P$ as long as $S$ is such that the special particular integral $Q$ can be found explicitly. For such examples of $S$, see Section 6.1.5.
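The three-part structure is easiest to see for a constant sales rate, where the particular integral is the constant $Q = \hat{I} + (\rho/\alpha^2)(\hat{P} - S)$ of (6.25) and $\dot{Q} = 0$. The following sketch evaluates (6.19) through (6.21) under that assumption; the function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def turnpike_solution(t, T, I0, I_hat, P_hat, S, h, c, rho):
    """Approximate inventory, production and adjoint values from (6.19)-(6.21)
    for a constant sales rate S (so Q is constant and dQ/dt = 0)."""
    alpha2 = h / c
    disc = math.sqrt(rho ** 2 + 4 * alpha2)
    m1, m2 = (rho - disc) / 2, (rho + disc) / 2
    Q = I_hat + (rho / alpha2) * (P_hat - S)    # turnpike inventory level
    b1, b2 = I0 - Q, P_hat - S                  # constants (6.13), (6.14)
    start = b1 * math.exp(m1 * t)               # decays as t grows
    end = b2 * math.exp(m2 * (t - T))           # matters only near t = T
    I = start + Q + end / m2
    P = m1 * start + S + end
    lam = c * (P - P_hat)                       # from the decision rule (6.4)
    return I, P, lam

params = dict(T=20.0, I0=10.0, I_hat=15.0, P_hat=30.0, S=25.0, h=1.0, c=1.0, rho=0.05)
for t in (0.0, 10.0, 20.0):
    print(t, turnpike_solution(t, **params))
```

Near $t = 0$ the inventory starts at $I_0$, in the middle of the horizon it sits close to $Q$ with production close to $S$, and near $t = T$ production moves toward $\hat{P}$ so that $\lambda(T)$ is approximately zero.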



6.1.3 The Infinite Horizon Solution 

It is important to show that this solution also makes sense when $T \to \infty$. In this case it is usual to assume that $\rho > 0$, and to show that the limit of this finite horizon solution as $T \to \infty$ also solves the infinite horizon problem. Note that as $T \to \infty$, the ending correction disappears because $a_2$ defined in (6.16) becomes 0. We now have

$$\lambda(t) = c\left[m_1 b_1 e^{m_1 t} + \dot{Q} + S - \hat{P}\right]. \tag{6.22}$$

If $S$ is bounded, then $Q$ is bounded, and therefore, $\lim_{t \to \infty} \lambda(t)$ is bounded. Then for $\rho > 0$,

$$\lim_{t \to \infty} e^{-\rho t}\lambda(t) = 0. \tag{6.23}$$

By the sufficiency of the maximum principle conditions (Section 2.4), it can be verified that the limiting solution

$$I(t) = b_1 e^{m_1 t} + Q, \quad P(t) = m_1 b_1 e^{m_1 t} + \dot{Q} + S \tag{6.24}$$

is optimal. If $I(0) = Q(0)$, the solution is always on the turnpike. Note that the triple $\{\bar{I}, \bar{P}, \bar{\lambda}\} = \{Q, \dot{Q} + S, c(\dot{Q} + S - \hat{P})\}$ represents a nonstationary turnpike. If $I(0) \ne Q(0)$, then $b_1 \ne 0$ and the expressions (6.24) imply that the paths of inventory and production only approach but never attain the turnpike.

Note that the solution does not satisfy the sufficiency transversality condition when $\rho = 0$. In this case, all solutions give an infinite value (which is the worst value since we are minimizing) for the cost objective function. Our limiting solution also gives an infinite value for the objective function. It makes sense to define the limiting solution for $\rho = 0$ to also be the optimal infinite horizon solution for the undiscounted case.
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6.1.4 A Complete Analysis of the Constant Positive S 
Case with Infinite Horizon 

In the last three subsections we ignored the production constraint $P \ge 0$
and used (6.4) as the optimal decision rule. Here we shall solve the
production-inventory problem subject to $P \ge 0$, and use (6.5) as the
optimal production rule. For simplicity of analysis and exposition we
shall assume also that $S$ is a positive constant, $T = \infty$, and $\rho > 0$.

The particular integral for the constant positive $S$ case from Table
A.5 is $Q(t) = Q$, where $Q$ is a constant. Thus, the solution (6.11) of the
differential equation (6.8) reduces to

$$I(t) = a_1 e^{m_1 t} + a_2 e^{m_2 t} + Q.$$

Substituting in (6.8) and recognizing that $S$ is a constant so that $\dot{S} = 0$,
we obtain

$$Q = \frac{\rho}{\alpha^2}(\hat{P} - S) + \hat{I}. \tag{6.25}$$

Note that $a_1$ and $a_2$ are given in (6.15) and (6.16) in terms of $b_1$ and $b_2$,
which from (6.13), (6.14), and (6.25) are

$$b_1 = I_0 - \hat{I} - (\rho/\alpha^2)(\hat{P} - S) \quad \text{and} \quad b_2 = \hat{P} - S.$$

The turnpike is defined by the triple $\{\bar{I}, \bar{P}, \bar{\lambda}\} = \{(\rho/\alpha^2)(\hat{P} - S) + \hat{I},\ S,\ c(S - \hat{P})\}$
formed from the turnpike expressions in (6.19), (6.20), and (6.21),
respectively. Note that we could have obtained the turnpike levels directly
by applying the conditions (3.74) in Chapter 3, which in this case are

$$\dot{I} = 0, \quad \dot{\lambda} = 0, \quad \text{and} \quad P = \hat{P} + \lambda/c. \tag{6.26}$$

If $I_0 = Q$, then the optimal solution stays on the turnpike. If $I_0 \ne Q$,
then the optimal solution is given by

$$P(t) = m_1 b_1 e^{m_1 t} + S = m_1 (I_0 - Q) e^{m_1 t} + S. \tag{6.27}$$

Clearly if $I_0 \le Q$, this provides a nonnegative optimal production
throughout. In case $I_0 > Q$, note first that $P(t)$ increases with $t$ and

$$P(0) = m_1 (I_0 - Q) + S. \tag{6.28}$$

Furthermore, if $I_0 - Q > -S/m_1$, we have a negative value for $P(0)$, which
is infeasible. By (6.5), $P^*(0) = 0$. We can now depict this situation
in Figure 6.1. The time $\hat{t}$ shown in the figure is the time at which
$\hat{P} + \lambda(\hat{t})/c = 0$.

Figure 6.1: Optimal Production and Inventory Levels

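Since production just becomes positive at $\hat{t}$, setting the expression in
(6.28) to zero with $I_0$ replaced by $I(\hat{t})$ gives

$$I(\hat{t}) = Q - \frac{S}{m_1}. \tag{6.29}$$

For $t < \hat{t}$, we have

$$\dot{I} = -S \;\Rightarrow\; I = I_0 - St, \tag{6.30}$$

$$\dot{\lambda} = \rho\lambda + h(I - \hat{I}), \quad \lambda(\hat{t}) = -c\hat{P}. \tag{6.31}$$

We can substitute $I_0 - St$ for $I$ in equation (6.31) and solve for $\lambda$. Note
that we can easily obtain $\hat{t}$ from (6.29) and (6.30) as

$$\hat{t} = \frac{I_0 - Q}{S} + \frac{1}{m_1}. \tag{6.32}$$

For $t > \hat{t}$, the solution is given by (6.19), (6.20), and (6.21) with $t$
replaced by $(t - \hat{t})$ and $b_1 = I(\hat{t}) - Q = -S/m_1$.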

The solution shown in Figure 6.1 represents a situation where there
is excess initial inventory, i.e., $I_0 > Q - S/m_1$. For $I_0 \le Q - S/m_1$, the
optimal control given by $\hat{P} + \lambda/c$ is nonnegative, and the inventory will
be driven asymptotically to the level $Q$.
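
The constant-demand analysis above can be sketched numerically. The following is a minimal illustration, not part of the original text: the function name `constant_demand_policy` and the parameter values in the usage line are assumptions chosen only for demonstration, and the code simply evaluates (6.25), (6.29), (6.30), and (6.32) together with the shifted turnpike formulas.

```python
import math

def constant_demand_policy(rho, h, c, P_hat, I_hat, S, I0):
    """Return (Q, m1, t_hat, production, inventory) for constant demand S > 0."""
    alpha_sq = h / c
    m1 = (rho - math.sqrt(rho ** 2 + 4 * alpha_sq)) / 2    # negative root
    Q = I_hat + (rho / alpha_sq) * (P_hat - S)               # turnpike level (6.25)
    threshold = Q - S / m1                                   # inventory level (6.29)
    # Production is shut down only when initial inventory exceeds the threshold.
    t_hat = (I0 - Q) / S + 1 / m1 if I0 > threshold else 0.0  # (6.32)
    b1 = (threshold if t_hat > 0 else I0) - Q                # coefficient after t_hat

    def inventory(t):
        if t < t_hat:
            return I0 - S * t                                # (6.30)
        return Q + b1 * math.exp(m1 * (t - t_hat))

    def production(t):
        if t < t_hat:
            return 0.0                                       # constraint P >= 0 binds
        return m1 * b1 * math.exp(m1 * (t - t_hat)) + S

    return Q, m1, t_hat, production, inventory

# Hypothetical parameter values, chosen only to produce excess initial inventory.
Q, m1, t_hat, production, inventory = constant_demand_policy(
    rho=0.1, h=1.0, c=1.0, P_hat=30.0, I_hat=15.0, S=10.0, I0=60.0)
```

With these assumed values the inventory is continuous at the switching time and production rises from zero toward the demand rate, as described for Figure 6.1.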

In the next section, we return to the model of Section 6.1 and examine 
the behavior of solutions for the case of fluctuating demands, such as 
polynomial or sinusoidal demand functions. 

6.1.5 Special Cases of Time Varying Demands 

We solve some numerical examples of the model described in Section 6.1
for $\rho = 0$ and $T < \infty$. For the first set of examples we assume that $S(t)$
is a polynomial of degree $2p$ or $2p - 1$ so that $S^{(2p+1)} = 0$, where
$S^{(k)}$ denotes the $k$th time derivative of $S$ with respect to $t$. In other words,

$$S(t) = C_0 t^{2p} + C_1 t^{2p-1} + \dots + C_{2p}, \tag{6.33}$$

where at least one of $C_0$ and $C_1$ is not zero. Then, it is easy to show
that a particular solution of (6.11) is

$$Q(t) = \hat{I} + \frac{1}{\alpha^2} S^{(1)} + \frac{1}{\alpha^4} S^{(3)} + \dots + \frac{1}{\alpha^{2p}} S^{(2p-1)}. \tag{6.34}$$

In Exercise 6.3 the reader is asked to verify this by direct substitution. 
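
As a small illustration of how the odd derivatives of a polynomial demand are summed for $\rho = 0$, the sketch below builds the particular integral from a coefficient list. The helper names (`derivative`, `add`, `particular_integral`) are hypothetical, and the sample cubic with $\hat{I} = 15$ and $\alpha = 1$ is used only as test input.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def add(p, q):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def particular_integral(s_coeffs, i_hat, alpha):
    """I_hat + S'/alpha^2 + S'''/alpha^4 + ... until the derivatives vanish."""
    q = [i_hat]
    d = derivative(s_coeffs)          # first derivative of S
    k = 1
    while d:
        q = add(q, [c / alpha ** (2 * k) for c in d])
        d = derivative(derivative(d))  # jump to the next odd derivative
        k += 1
    return q

# S(t) = t^3 - 12 t^2 + 32 t + 30, written lowest degree first.
Q = particular_integral([30, 32, -12, 1], i_hat=15, alpha=1)
```

The returned list is again ordered from the constant term upward, so it can be read off directly as a polynomial in $t$.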

For the second set of examples, we assume that $S(t)$ is a sinusoidal
function of the form

$$S(t) = A \sin \pi t + C, \tag{6.35}$$

where $A$ and $C$ are constants. In Exercise 6.4 you are asked to verify
that a particular solution of (6.8) for this $S$ is

$$Q(t) = \hat{I} + \frac{\pi A}{\alpha^2 + \pi^2} \cos \pi t. \tag{6.36}$$

It is well known in the theory of differential equations that demands
that are sums of functions of the form (6.33) and/or (6.35) give rise to
solutions that are sums of functions of the form (6.34) and/or (6.36).

Example 6.1 Assume $\hat{P} = 30$, $\hat{I} = 15$, $T = 8$, $\rho = 0$, and $h = c = 1$
so that $\alpha = 1$, $m_1 = -1$, and $m_2 = 1$. Assume

$$S(t) = t(t - 4)(t - 8) + 30 = t^3 - 12t^2 + 32t + 30.$$

Solution. It is then easy to show from (6.34) that

$$Q(t) = 3t^2 - 24t + 53 \quad \text{and} \quad \dot{Q}(t) = 6t - 24.$$

Also from (6.13), (6.14), and (6.15), we have $a_1 \approx b_1 = I_0 - 53$ and
$b_2 = -24$. Then, from (6.19) and (6.20),

$$I(t) = (I_0 - 53)e^{-t} + Q(t) - 24e^{t-8},$$

$$P(t) = -(I_0 - 53)e^{-t} + \dot{Q}(t) + S(t) - 24e^{t-8}.$$

In Figure 6.2 the graphs of sales, production, and inventory are drawn
with $I_0 = 10$ (a small starting inventory), which makes $b_1 = -43$. In
Figure 6.3 the same graphs are drawn with $I_0 = 50$ (a large starting
inventory), which makes $b_1 = -3$. In Figure 6.4 the same graphs are
drawn with $I_0 = 30$, which makes $b_1 = -23$. Note that initially during
the time from 0 to 4, the three cases are quite different, but during the
time from 4 to 8, they are nearly identical. The ending inventory ends
up being 29 in all three cases.
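
The three cases of Example 6.1 can be tabulated with a few lines of code. This is only a sketch that evaluates the approximate closed-form expressions quoted in the example; the function names are illustrative and not from the text.

```python
import math

T = 8.0

def S(t):
    return t * (t - 4) * (t - 8) + 30          # demand of Example 6.1

def Q(t):
    return 3 * t ** 2 - 24 * t + 53            # particular integral from (6.34)

def Q_dot(t):
    return 6 * t - 24

def inventory(t, i0):
    # starting correction + turnpike + ending correction, as in (6.19)
    return (i0 - 53) * math.exp(-t) + Q(t) - 24 * math.exp(t - T)

def production(t, i0):
    # same three-part structure, as in (6.20)
    return -(i0 - 53) * math.exp(-t) + Q_dot(t) + S(t) - 24 * math.exp(t - T)

# Ending inventory for the three starting inventories used in Figures 6.2-6.4.
ending = {i0: inventory(T, i0) for i0 in (10, 30, 50)}
```

Because the starting correction decays like $e^{-t}$, the entries of `ending` differ from 29 only by the small term $(I_0 - 53)e^{-8}$.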



Figure 6.2: Solution of Example 6.1 with $I_0 = 10$ (sales $S$, production $P$, and inventory $I$ plotted against time)



Figure 6.3: Solution of Example 6.1 with $I_0 = 50$

Example 6.2 As another example assume that

$$S(t) = A + Bt + \sum_{k=1}^{K} C_k \sin(\pi D_k t + E_k), \tag{6.37}$$

where the constants $A$, $B$, $C_k$, $D_k$, and $E_k$ are estimated from future
demand data by means of one of the standard forecasting techniques
such as those in Brown (1959, 1963).

Solution. Making use of formulas (6.34) and (6.36) we have the special
particular integral

$$Q(t) = \hat{I} + \frac{1}{\alpha^2} B + \sum_{k=1}^{K} \frac{\pi C_k D_k}{\alpha^2 + (\pi D_k)^2} \cos(\pi D_k t + E_k). \tag{6.38}$$

6.2 Continuous Wheat Trading Model 

Consider a firm that buys and sells wheat. The firm's only assets are
cash and wheat, and the price of wheat over time is known with certainty.
The objective of this firm is to buy and sell wheat in order to maximize
the total value of its assets at the horizon time $T$. The problem here is
similar to the simple cash balance model of Section 5.1 except that there
are nonlinear holding costs associated with storing wheat. An extension
of this model to one having two control variables appears in Ijiri and
Thompson (1972).

Figure 6.4: Solution of Example 6.1 with $I_0 = 30$

6.2.1 The Model 

We introduce the following notation:

$T$ = the horizon time,
$x(t)$ = the cash balance in dollars at time $t$,
$y(t)$ = the wheat balance in bushels at time $t$,
$v(t)$ = the rate of purchase of wheat in bushels per unit time; a negative purchase means a sale,
$p(t)$ = the price of wheat in dollars per bushel at time $t$,
$r$ = the constant positive interest rate earned on the cash balance,
$h(y)$ = the cost of holding $y$ bushels per unit time.

In this section we permit $x$ and $y$ to go negative, meaning borrowing
of money and short-selling of wheat are allowed. In the next section we
disallow short-selling of wheat.

The state equations are:

$$\dot{x} = rx - h(y) - pv, \quad x(0) = x_0, \tag{6.39}$$

$$\dot{y} = v, \quad y(0) = y_0, \tag{6.40}$$

and the control constraints are

$$-V_2 \le v(t) \le V_1, \tag{6.41}$$

where $V_1$ and $V_2$ are nonnegative constants. The objective function is to

$$\text{maximize } \{ J = x(T) + p(T)y(T) \} \tag{6.42}$$

subject to (6.39)-(6.41). Note that the problem is in the linear Mayer
form.



6.2.2 Solution by the Maximum Principle 

Introduce the adjoint variables $\lambda_1$ and $\lambda_2$ and define the Hamiltonian
function

$$H = \lambda_1 [rx - h(y) - pv] + \lambda_2 v. \tag{6.43}$$

The adjoint equations are:

$$\dot{\lambda}_1 = -\lambda_1 r, \quad \lambda_1(T) = 1, \tag{6.44}$$

$$\dot{\lambda}_2 = h'(y)\lambda_1, \quad \lambda_2(T) = p(T). \tag{6.45}$$

It is easy to solve (6.44) as

$$\lambda_1(t) = e^{r(T-t)}, \tag{6.46}$$

and (6.45) as

$$\lambda_2(t) = p(T) - \int_t^T h'(y(\tau))\, e^{r(T-\tau)}\, d\tau. \tag{6.47}$$

The interpretation of $\lambda_1(t)$ is that it is the future value (at time $T$)
of one dollar held as cash from $t$ to $T$. Also the interpretation of $\lambda_2(t)$
is the price at time $T$ of a bushel of wheat less the total future value (at
time $T$) of the stream of storage costs incurred to store that bushel of
wheat from $t$ to $T$.

From (6.43) the optimal control is

$$v^*(t) = \text{bang}[-V_2, V_1; \lambda_2(t) - \lambda_1(t)p(t)]. \tag{6.48}$$

In Exercise 6.6 you are asked to provide the interpretation of this optimal
policy.
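
The structure of the policy (6.48) can be written down as a small function. The sketch below assumes the usual bang function convention (lower bound when the switching function is negative, upper bound when it is positive, undetermined when it vanishes); the names `bang`, `lambda1`, and `wheat_control` are illustrative, and returning `None` for the singular case is a choice made here, not something prescribed by the text.

```python
import math

def bang(lower, upper, w):
    """bang[lower, upper; w]: lower if w < 0, upper if w > 0, None if w == 0."""
    if w < 0:
        return lower
    if w > 0:
        return upper
    return None  # switching function vanishes; the control is not determined here

def lambda1(t, T, r):
    """Future value at time T of one dollar held as cash from t, as in (6.46)."""
    return math.exp(r * (T - t))

def wheat_control(t, T, r, price, lambda2, V1, V2):
    """Evaluate the policy (6.48) for given values of p(t) and lambda_2(t)."""
    w = lambda2 - lambda1(t, T, r) * price   # switching function
    return bang(-V2, V1, w)
```

For instance, `wheat_control(0, 1, 0.0, 3.0, 3.5, 1, 1)` returns the maximum purchase rate because the switching function is positive for these assumed numbers.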

Equations (6.39), (6.40), (6.47), and (6.48) determine the two-point
boundary value problem, which usually requires a numerical solution
procedure. In the next section we assume a special form for the storage
function $h(y)$ to be able to obtain a closed-form solution.

6.2.3 Complete Solution of a Special Case 

For this special case we assume $h(y) = \frac{1}{2}|y|$, $r = 0$, $x(0) = 10$, $y(0) = 0$,
$T = 6$, and

$$p(t) = \begin{cases} 3 & \text{for } 0 \le t \le 3, \\ 4 & \text{for } 3 < t \le 6. \end{cases} \tag{6.49}$$

We will apply the maximum principle developed in Chapter 2 to this
problem even though $h(y)$ is not differentiable at $y = 0$. The answer we
obtain can be obtained rigorously by using the maximum principle for
models involving nondifferentiable functions discussed, e.g., in Clarke
(1989, Chapter 4) and Feichtinger and Hartl (1985b, 1986, Appendix
A.3).

For this case with $r = 0$, we have $\lambda_1(t) = 1$ for all $t$ from (6.46) so
that the TPBVP is

$$\dot{x} = -\tfrac{1}{2}|y| - pv, \quad x(0) = 10, \tag{6.50}$$

$$\dot{y} = v, \quad y(0) = 0, \tag{6.51}$$

$$\dot{\lambda}_2(t) = \tfrac{1}{2}\, \text{sgn}(y), \quad \lambda_2(6) = 4. \tag{6.52}$$

For this simple problem it is easy to guess a solution. From the fact that
$\lambda_1 = 1$, the optimal policy (6.48) reduces to

$$v^*(t) = \text{bang}[-1, 1; \lambda_2(t) - p(t)]. \tag{6.53}$$

The graph of the price function is shown in Figure 6.5. Since $p(t)$ is
increasing, short-selling is never optimal. Since the storage cost is 1/2
unit per unit time and the wheat price jumps by 1 unit at $t = 3$, it
never pays to store wheat more than 2 time units. Because $y(0) = 0$, we
have $v^*(t) = 0$ for $0 \le t \le 1$. This obviously must be singular control.
Suppose we start buying wheat at $t^* > 1$. From (6.53) the rate of buying
is 1; clearly buying will continue at this rate until $t = 3$, and no longer.
In order not to lose money because of storing wheat, it must be sold
within 2 time units of its purchase. Clearly we should start selling at
$t = 3^+$ at the maximum rate of 1, and continue until a last sale time $t^{**}$.
In order to sell exactly all of the wheat purchased, we must have

$$3 - t^* = t^{**} - 3. \tag{6.54}$$

Figure 6.5: The Price Trajectory (6.49)

Thus, $v^*(t) = 0$ in the interval $[t^{**}, 6]$, which is also singular control.
With this policy, $y(t) > 0$ for all $t \in (t^*, t^{**})$. From (6.52), $\dot{\lambda}_2 = 1/2$ in
the interval $(t^*, t^{**})$. In order to have a singular control in the interval
$[t^{**}, 6]$, we must have $\lambda_2(t) = 4$ in that interval. Also, in order to have
singular control in $[0, t^*]$, we must have $\lambda_2(t) = 3$ in that interval. Since
$\lambda_2$ rises from 3 to 4 at the rate 1/2, we can now conclude that

$$t^{**} - t^* = 2, \tag{6.55}$$

and therefore $t^* = 2$ and $t^{**} = 4$. Thus from (6.52) and (6.53),

$$\lambda_2(t) = \begin{cases} 3, & 0 \le t \le 2, \\ 2 + t/2, & 2 \le t \le 4, \\ 4, & 4 \le t \le 6. \end{cases} \tag{6.56}$$

We can now sketch graphs for $\lambda_2(t)$, $v^*(t)$, and $y^*(t)$ as shown in
Figure 6.6. In Exercise 6.11 you are asked to show that these trajectories are
optimal by verifying that the maximum principle necessary conditions
hold and that these conditions are also sufficient.
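
The trajectories of this special case are simple enough to replay numerically. The sketch below is an illustration only: it hard-codes the policy derived above (buy on $(2, 3]$, sell on $(3, 4]$, do nothing elsewhere) and integrates the state equations with a midpoint rule. The step count and the helper names are assumptions made here.

```python
def price(t):
    return 3.0 if t <= 3 else 4.0           # price trajectory (6.49)

def v_star(t):
    if 2 < t <= 3:
        return 1.0                          # buy at the maximum rate
    if 3 < t <= 4:
        return -1.0                         # sell at the maximum rate
    return 0.0                              # singular arcs on [0, 2] and [4, 6]

def lambda2(t):
    if t <= 2:
        return 3.0
    if t <= 4:
        return 2.0 + t / 2
    return 4.0                              # adjoint trajectory (6.56)

def simulate(T=6.0, steps=6000, x0=10.0, y0=0.0):
    """Integrate (6.50)-(6.51) under v_star with a midpoint rule."""
    dt = T / steps
    x, y = x0, y0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        v = v_star(t_mid)
        y_mid = y + 0.5 * v * dt            # inventory at the middle of the step
        x += (-0.5 * abs(y_mid) - price(t_mid) * v) * dt
        y += v * dt
    return x, y

x_T, y_T = simulate()
J = x_T + price(6.0) * y_T                  # objective (6.42) with T = 6
```

Under these assumptions the firm ends with no wheat, and the gain over the initial cash of 10 is the one-unit price jump on one bushel less the storage cost incurred while holding it.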



Figure 6.6: Adjoint Variable, Optimal Policy and Inventory in the Wheat
Trading Model

6.2.4 The Wheat Trading Model with No Short-Selling

We next consider the wheat trading problem for the case in which
short-selling is not permitted, which requires that we impose the state
constraint $y \ge 0$. For simplicity in exposition we consider only a special
case due to Norstrom (1978), which is a slight modification of the model
of Section 6.2.3.

For the present case we assume $h(y) = y/2$, $r = 0$, $x(0) = 10$,
$y(0) = 1$, $V_1 = V_2 = 1$, $T = 3$, and

$$p(t) = \begin{cases} -2t + 7 & \text{for } 0 \le t < 2, \\ t + 1 & \text{for } 2 \le t \le 3. \end{cases} \tag{6.57}$$

The statement of the problem is:

$$\begin{cases} \max \{ J = x(3) + p(3)y(3) = x(3) + 4y(3) \} \\ \text{subject to} \\ \dot{x} = -\tfrac{1}{2}y - pv, \quad x(0) = 10, \\ \dot{y} = v, \quad y(0) = 1, \\ v + 1 \ge 0, \quad 1 - v \ge 0, \quad y \ge 0. \end{cases} \tag{6.58}$$

To solve this problem we use the Lagrangian form of the maximum
principle given in (4.29). The Hamiltonian is

$$H = \lambda_1(-y/2 - pv) + \lambda_2 v. \tag{6.59}$$

The optimal control is

$$v^*(t) = \text{bang}[-1, 1; \lambda_2(t) - \lambda_1(t)p(t)] \quad \text{when } y > 0. \tag{6.60}$$

Whenever $y = 0$ we must impose $\dot{y} = v \ge 0$ in order to ensure that no
short-selling occurs. Therefore,

$$v^*(t) = \text{bang}[0, 1; \lambda_2(t) - \lambda_1(t)p(t)] \quad \text{when } y = 0. \tag{6.61}$$

Next we form the Lagrangian

$$L = H + \mu_1(v + 1) + \mu_2(1 - v) + \eta v, \tag{6.62}$$

where $\mu_1$, $\mu_2$, and $\eta$ satisfy the complementary slackness conditions:

$$\mu_1 \ge 0, \quad \mu_1(v + 1) = 0, \tag{6.63}$$

$$\mu_2 \ge 0, \quad \mu_2(1 - v) = 0, \tag{6.64}$$

$$\eta \ge 0, \quad \eta y = 0, \quad \eta v = 0. \tag{6.65}$$

Furthermore, the optimal trajectory must satisfy

$$\frac{\partial L}{\partial v} = \lambda_2 - p\lambda_1 + \mu_1 - \mu_2 + \eta = 0. \tag{6.66}$$

With $r = 0$, we get $\lambda_1 = 1$ as before, and

$$\dot{\lambda}_2 = -\frac{\partial L}{\partial y} = 1/2, \quad \lambda_2(3) = 4. \tag{6.67}$$



From this we see that $\lambda_2$ is always increasing except possibly at a jump
time. Let $\hat{t}$ be a time such that there is no jump in the interval $(\hat{t}, 3]$,
i.e., $\hat{t}$ is the time of the last jump. Then,

$$\lambda_2(t) = t/2 + 5/2 \quad \text{for } \hat{t} \le t \le 3, \tag{6.68}$$

and the optimal control from (6.59) or (6.60) is $v^* = 1$, i.e., we buy wheat
at the maximum rate of 1. The smallest possible value of $\hat{t}$ is obtained
by equating $\lambda_2(\hat{t}) = p(\hat{t})$, thus

$$\hat{t}/2 + 5/2 = -2\hat{t} + 7 \;\Rightarrow\; \hat{t} = 1.8. \tag{6.69}$$

Since $p(t)$ is decreasing at the start of the problem, it appears that selling
at the maximum rate of 1, i.e., $v^* = -1$, should be optimal at the start.
Since the beginning inventory is $y(0) = 1$, selling at the rate of 1 can
continue only until $t = 1$, at which time the inventory $y(1)$ becomes 0.
Suppose that we do nothing, i.e., $v^*(t) = 0$ in the interval $(1, 1.8]$. Then,
$t = 1$ is an entry time (see Sections 4.1.1 and 4.2) and $t = 1.8$ is not an
entry time. Hence, as in Example 4.3, $\lambda_2(t)$ is continuous at $t = 1.8$, and
therefore $\lambda_2(t)$ is given by (6.68) in the interval $[1, 3]$, i.e.,

$$\lambda_2(t) = t/2 + 5/2 \quad \text{for } 1 \le t \le 3. \tag{6.70}$$

Using (6.66) with $\lambda_1 = 1$ in the interval $(1, 1.8]$ and $v^* = 0$ so that
$\mu_1 = \mu_2 = 0$, we have

$$\lambda_2 - p + \mu_1 - \mu_2 + \eta = \lambda_2 - p + \eta = 0,$$

and consequently

$$\eta(t) = p(t) - \lambda_2(t) \quad \text{for } t \in (1, 1.8]. \tag{6.71}$$

Since $h_t = 0$, the jump condition in (4.29) for the Hamiltonian at
$\tau = 1$ reduces to

$$H[x^*(1), u^*(1^-), \lambda(1^-), 1] = H[x^*(1), u^*(1^+), \lambda(1^+), 1].$$

From the definition of the Hamiltonian $H$ in (6.59), we can rewrite the
condition as

$$\lambda_1(1^-)[-y(1)/2 - p(1^-)v^*(1^-)] + \lambda_2(1^-)v^*(1^-) = \lambda_1(1^+)[-y(1)/2 - p(1^+)v^*(1^+)] + \lambda_2(1^+)v^*(1^+).$$

Since $\lambda_1(t) = 1$ for all $t$, the above condition reduces to

$$-p(1^-)v^*(1^-) + \lambda_2(1^-)v^*(1^-) = -p(1^+)v^*(1^+) + \lambda_2(1^+)v^*(1^+).$$

Substituting the values of $p(1^-) = p(1^+) = 5$ from (6.57), $\lambda_2(1^+) = 3$
from (6.70), and $v^*(1^+) = 0$ and $v^*(1^-) = -1$ from the above discussion,
we obtain

$$-5(-1) + \lambda_2(1^-)(-1) = -5(0) + 3(0) = 0 \;\Rightarrow\; \lambda_2(1^-) = 5. \tag{6.72}$$

We can now use the jump condition in (4.29) on the adjoint variables
to obtain

$$\lambda_2(1^-) = \lambda_2(1^+) + \zeta(1) \;\Rightarrow\; \zeta(1) = \lambda_2(1^-) - \lambda_2(1^+) = 5 - 3 = 2.$$

It is important to note that in the interval $[1, 1.8]$, the optimal control
condition (6.61) holds, justifying our supposition that $v^* = 0$ in this
interval. Furthermore, using (6.72) and (6.67),

$$\lambda_2(t) = t/2 + 9/2 \quad \text{for } t \in [0, 1), \tag{6.73}$$

and the optimal control condition (6.60) holds, justifying our supposition
that $v^* = -1$ in this interval. The graphs of $\lambda_2(t)$, $p(t)$, and $v^*(t)$ are
displayed in Figure 6.7. To complete the solution of the problem, you are
asked to determine the values of $\mu_1$, $\mu_2$, and $\eta$ in these various intervals.

Figure 6.7: Adjoint Trajectory and Optimal Policy for the Wheat
Trading Model
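
The solution of the no-short-selling problem can be replayed numerically as a plausibility check. The sketch below is illustrative only: it hard-codes the policy found above (sell on $[0, 1)$, wait on $[1, 1.8]$, buy afterwards), integrates the state equations in (6.58) with a midpoint rule, and evaluates the size of the adjoint jump at $t = 1$. Function names and the step count are assumptions.

```python
def price(t):
    return -2 * t + 7 if t < 2 else t + 1           # price trajectory (6.57)

def lambda2(t):
    return t / 2 + 4.5 if t < 1 else t / 2 + 2.5    # (6.73) and (6.70)

def v_star(t):
    if t < 1:
        return -1.0       # sell until the inventory is exhausted
    if t <= 1.8:
        return 0.0        # boundary arc with y = 0
    return 1.0            # buy at the maximum rate

def objective(T=3.0, steps=3000, x0=10.0, y0=1.0):
    """Integrate the state equations of (6.58) under v_star with a midpoint rule."""
    dt = T / steps
    x, y = x0, y0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        v = v_star(t_mid)
        y_mid = y + 0.5 * v * dt
        x += (-0.5 * y_mid - price(t_mid) * v) * dt
        y += v * dt
    return x + price(T) * y

J = objective()
# Left limit of lambda_2 at t = 1 from (6.73) minus the right value from (6.70).
jump = (1 / 2 + 4.5) - lambda2(1.0)
```

The switching function `lambda2(t) - price(t)` is negative before $t = 1.8$ and positive afterwards, which is consistent with the signs required by (6.60) and (6.61).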



6.3 Decision Horizons and Forecast Horizons 

In some dynamic problems it is possible to show that the optimal
decisions during an initial positive time interval are either partially or wholly
independent of the data from some future time onwards. In such cases,
a forecast of the future data needs to be made only as far as that time
to make optimal decisions in the initial time interval. The initial time
interval is called the decision horizon, and the time up to which data is
required to make the optimal decisions during the decision horizon is called
the forecast horizon; see Bes and Sethi (1988), Bensoussan, Crouhy, and
Proth (1983), and Haurie and Sethi (1984) for details on these concepts.
Whenever they exist, these horizons naturally decompose the problem
into a series of smaller problems.

If the optimal decisions during the decision horizon are completely
independent of the data beyond the forecast horizon, then the latter
is called a strong forecast horizon. If, on the other hand, some mild
restrictions on the data after the forecast horizon are required in order
to keep the optimal decisions during the decision horizon unaffected,
then it is called a weak forecast horizon.

In this section we shall demonstrate these concepts in the context of
the wheat trading model of the previous section; see Sethi and Thompson
(1982). In Section 6.3.1 we shall obtain a decision horizon for the model
of Section 6.2.4 which is also a weak forecast horizon. In Section 6.3.2
we modify the wheat trading model by adding a warehousing constraint.
For the new problem we obtain a decision horizon and a strong forecast
horizon. See also Rempala (1986) and Hartl (1986a, 1988a) for further
research in the context of the wheat trading model.

In what follows we obtain these horizons and verify them for some
examples with different forecast data. For more details and proofs in
other situations including more general ones, see Modigliani and Hohn
(1955), Lieber (1973), Pekelman (1974, 1975, 1979), Kleindorfer and
Lieber (1979), Vanthienen (1975), Morton (1978), Lundin and Morton
(1975), Rempala and Sethi (1988, 1992), Hartl (1989a), and Sethi (1990).

6.3.1 Horizons for the Wheat Trading Model 

For the model of Section 6.2.4, we will demonstrate that $t = 1$ is a
decision horizon as well as a weak forecast horizon. In Figure 6.8 we
have redrawn Figure 6.7 with a new price trajectory in the time interval
$[1, 3]$. Also in the figure, we have extended the initial $\lambda_2$ trajectory
and labeled it the price shield. Its significance is that, as long as the
new price trajectory in the interval $[1, 3]$ stays below the price shield,
the optimal solution in the interval $[0, 1]$, which is the decision horizon,
remains unchanged, i.e., it is optimal to sell throughout the interval. The
restriction that $p(t)$ must stay below the price shield in $[1, 3]$ is the reason
that $t = 1$ is a weak forecast horizon. The optimality of the control shown
in Figure 6.8 can be concluded by obtaining the adjoint trajectory in the
interval $[1, 3]$ as a straight line with slope 1/2 with its terminal value
$\lambda_2(3) = p(3)$. This way of drawing the adjoint trajectory is correct as
long as the corresponding policy does not violate the inventory constraint
$y(t) \ge 0$. If the inventory constraint is violated, then the trajectory
will contain jumps, and obtaining it is more complicated. However, the
decision horizon and weak forecast horizon still occur at $t = 1$. Moreover,
if we let $T > 1$ be any finite horizon and assume $p(t)$ in the interval $[1, T]$
is always below the price shield line of Figure 6.8 extended to $T$, then
the policy of selling at the maximum rate in the interval $[0, 1]$ remains
optimal. It is in this sense that the policy of selling in the interval $[0, 1]$
can be considered optimal for the infinite horizon problem as long as $p(t)$
stays below the extended price shield. Suppose for instance that it is
known with certainty that $p(t) \le \bar{p}$ for all $t$. Then, the intersection time
of the price shield and $\bar{p}$, which is $\hat{t} = 2\bar{p} - 9$, shown as $\hat{t} = 2$ for $\bar{p} = 5.5$ in
Figure 6.8, defines a "stronger" forecast horizon than the weak forecast
horizon of 1.
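
The price shield idea lends itself to a simple numerical check. The following sketch assumes the shield is the straight-line extension $t/2 + 9/2$ of the initial adjoint trajectory (6.73); the sampling-based test `stays_below_shield` and the sample forecasts are hypothetical constructions for illustration, not a procedure from the text.

```python
def price_shield(t):
    """Extension of the initial adjoint trajectory: t/2 + 9/2."""
    return t / 2 + 9 / 2

def stays_below_shield(price, t_start, t_end, steps=1000):
    """Sample a price forecast on [t_start, t_end] and compare it with the shield."""
    for k in range(steps + 1):
        t = t_start + (t_end - t_start) * k / steps
        if price(t) > price_shield(t):
            return False
    return True

def stronger_forecast_horizon(p_bar):
    """Time at which the shield reaches a known price ceiling p_bar."""
    return 2 * p_bar - 9

original = lambda t: -2 * t + 7 if t < 2 else t + 1    # price (6.57) on [1, 3]
steep = lambda t: 5 + 2 * (t - 1)                      # hypothetical rapid price rise
```

Sampling on a grid can miss a brief excursion above the shield, so a check of this kind is only a screening device for a given forecast.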







Figure 6.8: Decision Horizon and Optimal Policy for the Wheat Trading 
Model 



6.3.2 Horizons for the Wheat Trading Model with 
Warehousing Constraint 

In order to give an example in which a strong forecast horizon occurs,
we modify the example of Section 6.2.4 by adding the warehousing
constraint $y \le 1$ or

$$1 - y \ge 0, \tag{6.74}$$

changing the terminal time to $T = 4$, and defining the price trajectory
to be

$$p(t) = \begin{cases} -2t + 7 & \text{for } t \in [0, 2), \\ t + 1 & \text{for } t \in [2, 4]. \end{cases} \tag{6.75}$$



The Hamiltonian of the new problem is unchanged and is given in
(6.59). Furthermore, $\lambda_1 = 1$. The optimal control is defined in three
parts as:

$$v^*(t) = \text{bang}[-1, 1; \lambda_2(t) - p(t)] \quad \text{when } 0 < y < 1, \tag{6.76}$$

$$v^*(t) = \text{bang}[0, 1; \lambda_2(t) - p(t)] \quad \text{when } y = 0, \tag{6.77}$$

$$v^*(t) = \text{bang}[-1, 0; \lambda_2(t) - p(t)] \quad \text{when } y = 1. \tag{6.78}$$

Defining a Lagrange multiplier $\beta$ for the derivative of (6.74), i.e., for
$-\dot{y} = -v \ge 0$, we form the Lagrangian

$$L = H + \mu_1(v + 1) + \mu_2(1 - v) + \eta v + \beta(-v), \tag{6.79}$$

where $\mu_1$, $\mu_2$, $\eta$ satisfy (6.63)-(6.65) and $\beta$ satisfies

$$\beta \ge 0, \quad \beta(1 - y) = 0, \quad \beta v = 0. \tag{6.80}$$

Furthermore, the optimal trajectory must satisfy

$$\frac{\partial L}{\partial v} = \lambda_2 - p + \mu_1 - \mu_2 + \eta - \beta = 0. \tag{6.81}$$

As before, $\lambda_1 = 1$ and $\lambda_2$ satisfies

$$\dot{\lambda}_2 = 1/2, \quad \lambda_2(4) = p(4) = 5. \tag{6.82}$$

Let $\hat{t}$ be the time of the last jump of the adjoint function $\lambda_2(t)$ before
the terminal time $T = 4$. Then,

$$\lambda_2(t) = t/2 + 3 \quad \text{for } \hat{t} \le t \le 4. \tag{6.83}$$



The graph of (6.83) intersects the price trajectory at $t = 8/5$ as
shown in Figure 6.9. It also stays above the price trajectory in the
interval $[8/5, 4]$ so that, if there were no warehousing constraint (6.74),
the optimal decision in this interval would be to buy at the maximum
rate. However, with the constraint (6.74), this is not possible. Thus
$\hat{t} > 8/5$, since $\lambda_2$ will have a jump in the interval $[8/5, 4]$.

Figure 6.9: Optimal Policy and Horizons for the Wheat Trading Model
with Warehouse Constraint

To find the actual value of $\hat{t}$ we must insert a line of slope 1/2 above
the minimum price at $t = 2$ in such a way that its two intersection points
with the price trajectory are exactly one time unit (the time required to
fill up the warehouse) apart. Thus using (6.75), $\hat{t}$ must satisfy

$$-2(\hat{t} - 1) + 7 + 1/2 = \hat{t} + 1,$$

which yields $\hat{t} = 17/6$.
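
The linear condition for the last jump time can be solved exactly with rational arithmetic. The sketch below is a small illustration; the function name `last_jump_time` and its parameterization of the two price branches are assumptions introduced here.

```python
from fractions import Fraction

def last_jump_time(a, b, c, d, slope, fill_time):
    """Solve a - b*(t - fill_time) + slope*fill_time = c + d*t for t.

    The falling price branch is a - b*t and the rising branch is c + d*t;
    the adjoint line has the given slope and must span exactly fill_time.
    """
    return (a + b * fill_time + slope * fill_time - c) / (b + d)

# Price branches of (6.75): -2t + 7 and t + 1; slope 1/2; one time unit to fill.
t_hat = last_jump_time(Fraction(7), Fraction(2), Fraction(1), Fraction(1),
                       Fraction(1, 2), Fraction(1))

# Intercept of the line of slope 1/2 passing through (t_hat, p(t_hat)).
intercept = (1 + t_hat) - Fraction(1, 2) * t_hat
```

The same line meets the falling price branch one time unit earlier, at `t_hat - 1`.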

The rest of the analysis for determining $\lambda_2$ including the jump
conditions is similar to that given in Section 6.2.4. Thus,

$$\lambda_2(t) = \begin{cases} t/2 + 9/2 & \text{for } t \in [0, 1), \\ t/2 + 29/12 & \text{for } t \in [1, 17/6), \\ t/2 + 3 & \text{for } t \in [17/6, 4]. \end{cases} \tag{6.84}$$

Given (6.84), the optimal policy is given by (6.76)-(6.78) and is shown
in Figure 6.9.
To complete the maximum principle we must derive expressions for
the Lagrange multipliers in the four intervals shown in Figure 6.9.

Interval $[0, 1)$: $\mu_2 = \eta = \beta = 0$, $\mu_1 = p - \lambda_2 > 0$,
$v^* = -1$, $0 < y^* < 1$.

Interval $[1, 11/6)$: $\mu_1 = \mu_2 = \beta = 0$, $\eta = p - \lambda_2 > 0$,
$v^* = 0$, $y^* = 0$.

Interval $[11/6, 17/6)$: $\mu_1 = \eta = \beta = 0$, $\mu_2 = \lambda_2 - p > 0$,
$v^* = 1$, $0 < y^* < 1$.

Interval $[17/6, 4]$: $\mu_1 = \mu_2 = \eta = 0$, $\beta = \lambda_2 - p > 0$,
$v^* = 0$, $y^* = 1$.

In Exercise 6.16 you are asked to solve another variant of this problem.

For the example in Figure 6.9 we have labeled $t = 1$ as a decision
horizon and $t = 17/6$ as a strong forecast horizon. By that we mean that
the optimal decision in $[0, 1]$ continues to be to sell at the maximum rate
regardless of the price trajectory $p(t)$ for $t > 17/6$. Because $t = 17/6$ is
a strong forecast horizon, we can terminate the price shield at that time
as shown in the figure.

In order to illustrate the statements in the previous paragraph, we
consider two examples of price changes after $t = 17/6$.

Example 6.3 Assume the price trajectory to be

$$p(t) = \begin{cases} -2t + 7 & \text{for } t \in [0, 2), \\ t + 1 & \text{for } t \in [2, 17/6), \\ 25t/7 - 44/7 & \text{for } t \in [17/6, 4], \end{cases}$$

which is sketched in Figure 6.10. Note that the price trajectory up to
time 17/6 is the same as before, and the price after time 17/6 goes
above the extension of the price shield in Figure 6.9.

Figure 6.10: Optimal Policy and Horizons for Example 6.3

Solution. The new $\lambda_2$ trajectory is shown in Figure 6.10, which is the
same as before for $t < 17/6$, and after that it is $\lambda_2(t) = t/2 + 6$ for
$t \in [17/6, 4]$. The optimal policy is as shown in Figure 6.10, and as
previously asserted, the optimal policy in $[0, 1)$ remains unchanged. In
Exercise 6.16 you are asked to verify the maximum principle for the
solution of Figure 6.10.

Example 6.4 Assume the price trajectory to be

$$p(t) = \begin{cases} -2t + 7 & \text{for } t \in [0, 2), \\ t + 1 & \text{for } t \in [2, 17/6), \\ -t/2 + 21/4 & \text{for } t \in [17/6, 4], \end{cases}$$

which is sketched in Figure 6.11.

Figure 6.11: Optimal Policy and Horizons for Example 6.4

Solution. Again the price trajectory is the same up to time 17/6, but
the price after time 17/6 is declining. This changes the optimal policy
in the time interval $[1, 17/6)$, but the optimal policy will still be to sell
in $[0, 1)$.

As in the beginning of the section, we solve (6.82) to obtain $\lambda_2(t) =
t/2 + 5/4$ for $\hat{t}_1 \le t \le 4$, where $\hat{t}_1 \ge 1$ is the time of the last jump
which is to be determined. It is intuitively clear that some profit can be
made by buying and selling to take advantage of the price rise between
$t = 2$ and $t = 17/6$. For this, the $\lambda_2(t)$ trajectory must cross the price
trajectory between times 2 and 17/6 as shown in Figure 6.11, and the
inventory $y$ must go to 0 between times 17/6 and 4 so that $\lambda_2$ can jump
downward to satisfy the ending condition $\lambda_2(4) = p(4) = 13/4$. Since
we must buy and sell equal amounts, the point of intersection of the $\lambda_2$
trajectory with the rising price segment, i.e., $\hat{t}_1 - a$, must be exactly in
the middle of the two other intersection points, $\hat{t}_1$ and $\hat{t}_1 - 2a$, of $\lambda_2$ with
the two declining price trajectories. Thus, $\hat{t}_1$ and $a$ must satisfy:

$$-2(\hat{t}_1 - 2a) + 7 + a/2 = (\hat{t}_1 - a) + 1,$$

$$(\hat{t}_1 - a) + 1 + a/2 = -\hat{t}_1/2 + 21/4.$$

These can be solved to yield $\hat{t}_1 = 163/54$ and $a = 5/9$. The times
$\hat{t}_1$, $\hat{t}_1 - a$, and $\hat{t}_1 - 2a$ are shown in Figure 6.11. The $\lambda_2$ trajectory is
given by

$$\lambda_2(t) = \begin{cases} t/2 + 9/2 & \text{for } t \in [0, 1), \\ t/2 + 241/108 & \text{for } t \in [1, 163/54), \\ t/2 + 5/4 & \text{for } t \in [163/54, 4]. \end{cases}$$

Evaluation of the Lagrange multipliers and verification of the maximum
principle is similar to that for the case in Figure 6.9.
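
The two intersection conditions of Example 6.4 form a linear system in the last jump time and the half-width $a$, which can be solved exactly. In the sketch below the conditions have been rearranged by hand into $3\hat{t}_1 - \tfrac{11}{2}a = 6$ and $\tfrac{3}{2}\hat{t}_1 - \tfrac{1}{2}a = \tfrac{17}{4}$; the helper `solve_2x2` (Cramer's rule) is a generic illustration rather than anything specific to the text.

```python
from fractions import Fraction as F

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

# Rearranged conditions: 3*t1 - (11/2)*a = 6 and (3/2)*t1 - (1/2)*a = 17/4.
t1, a = solve_2x2(F(3), F(-11, 2), F(6), F(3, 2), F(-1, 2), F(17, 4))

# Intercept of the middle adjoint segment, which passes through the rising
# price branch p(t) = t + 1 at time t1 - a with slope 1/2.
middle_intercept = (t1 - a) + 1 - (t1 - a) / 2
```

Exact fractions avoid rounding in the intercept, which is why `Fraction` is used here instead of floating point.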

In Section 6.3 we have given several examples of decision horizons
and weak and strong forecast horizons. In Section 6.3.1 we found a
decision horizon which was also a weak forecast horizon, and it occurred
exactly when $y(t) = 0$. We also introduced the idea of a price shield in
that section. In Section 6.3.2 we imposed a warehousing constraint and
obtained the same decision horizon and a strong forecast horizon, which
occurred when $y(t) = 1$.

Note that if we had solved the problem with $T = 1$, then $y^*(1) = 0$;
and if we had solved the problem with $T = 17/6$, then $y^*(1) = 0$ and
$y^*(17/6) = 1$. The latter problem has the smallest $T$ such that both
$y^* = 0$ and $y^* = 1$ occur for $t > 0$, given the price trajectory. This is one
of the ways that time 17/6 can be found to be a forecast horizon which
follows the decision horizon at time 1. There are other ways to find
strong forecast horizons; see Pekelman (1974, 1975, 1979), Lundin and
Morton (1975), Morton (1978), Kleindorfer and Lieber (1979), Sethi and
Chand (1979, 1981), Chand and Sethi (1982, 1983, 1990), Bensoussan,
Crouhy, and Proth (1983), Teng, Thompson, and Sethi (1984), Bean and
Smith (1984), Bes and Sethi (1988), Bhaskaran and Sethi (1987, 1988),
Chand, Sethi, and Proth (1990), Bylka and Sethi (1992), Bylka, Sethi,
and Sorger (1992), and Chand, Sethi, and Sorger (1992).

EXERCISES FOR CHAPTER 6 

6.1 Verify the expressions for $a_1$ and $a_2$ given in (6.15) and (6.16).

6.2 For the model of Section 6.1.4, derive the turnpike triple by using
the conditions in (6.26).

6.3 Verify (6.34). Note that $\rho = 0$ is assumed in Section 6.1.5.

6.4 Verify (6.36). Again assume $\rho = 0$.

6.5 Given the demand function

$$S = t(t - 4)(t - 8)(t - 12)(t - 16) + 30,$$

$\rho = 0$, $\hat{I} = 15$, $T = 16$, and $\alpha = 1$, obtain $Q(t)$ from (6.34).

6.6 Give an intuitive interpretation of (6.48).

6.7 Assume that there is a transaction cost when $v$ units of wheat
are bought or sold in the model of Section 6.2.1. Derive the form
of the optimal policy.

6.8 Set up the two-point boundary value problem for Exercise 6.7 with
$\beta = 0.05$, $h(y) = (1/2)y^2$, and the remaining values of parameters
as in the model of Section 6.2.3.

6.9 Use a computer to solve the TPBVP of Exercise 6.8 by using the
flow chart given in Figure 6.12. (This method is called the shooting
method.) You may use EXCEL as illustrated in Section 2.5.

Figure 6.12: The Flow Chart for Exercise 6.9



6.10 In Exercise 6.7, assume $T = 10$, $x(0) = 10$, $y(0) = 0$, $\beta = 1/18$,
$h(y) = (1/2)y^2$, $V_1 = V_2 = \infty$, $r = 0$, and $p(t) = 10 + t$. Solve the
resulting TPBVP to obtain the optimal control in closed form.

6.11 Show that the solution obtained for the problem in Section 6.2.3
satisfies the necessary conditions of the maximum principle.
Conclude the optimality of the solution by showing that the maximum
principle is sufficient.

6.12 Re-solve the problem of Section 6.2.3 with $V_1 = 2$ and $V_2 = 1$.

6.13 Compute the optimal trajectories for $\mu_1$, $\mu_2$, and $\eta$ for the model
in Section 6.2.4.

6.14 Solve the model in Section 6.2.4 with each of the following
conditions:

(a) $y(0) = 2$.

(b) $T = 10$ and $p(t) = 2t - 2$ for $3 \le t \le 10$.

6.15 Re-solve the model of Section 6.3.2 with $y(0) = 1/2$ and the
additional warehousing constraint $y \le 1/2$.

6.16 Verify that the solution shown in Figure 6.10 satisfies the maximum
principle.

6.17 Solve and interpret the following production planning problem with
linear inventory holding costs:

$$\begin{cases} \max \left\{ J = \displaystyle\int_0^T -\left[ hI + \frac{c}{2}P^2 \right] dt \right\} \\ \text{subject to} \\ \dot{I} = P, \quad I(0) = 0, \quad I(T) = B; \quad 0 < B < hT^2/2c, \\ P \ge 0 \text{ and } I \ge 0. \end{cases} \tag{6.85}$$

6.18 Re-solve Exercise 6.17 with the state equation $\dot{I} = P - S$, $I(0) =
I_0 > 0$, and $I(T)$ not constrained. Also drop the production
constraint $P \ge 0$ for simplicity. (Note that negative production can
and will occur when initial inventory $I_0$ is too large. Of course, how
large is too large depends on the other parameters of the problem.)




Chapter 7 



Applications to Marketing 



Over the years, a number of applications of optimal control theory have 
been made to the field of marketing. Many of these applications deal 
with the problem of finding or characterizing an optimal advertising 
price over time. Others deal with the problem of determining optimal 
price and quality over time, in addition to or without advertising. The 
reader is referred to Sethi (1977a) and Feichtinger, Hartl, and Sethi 
(1994) for comprehensive reviews on dynamic optimal control problems 
in advertising and related problems. In this chapter we discuss optimal 
advertising policies for two of the well-known models called the Nerlove- 
Arrow model and the Vidale- Wolfe model. 

To describe the specific problems under consideration, let us assume
that a firm has some way of knowing or estimating the dynamics of sales
and advertising. Such knowledge is expressed in terms of a differential
equation with goodwill or rate of sales as the state variable and the
rate of advertising expenditures as the control variable. We assume
that the firm wishes to maximize an objective function (the criterion
function) which reflects its profit motives expressed in terms of sales and
advertising rates. The optimal control problem is to find an advertising
policy which maximizes the firm's objective function.

The plan of this chapter is as follows. Section 7.1 covers the
Nerlove-Arrow model as well as a nonlinear extension of it. Section 7.2
deals with the Vidale-Wolfe advertising model and its detailed analysis
using Green's theorem in conjunction with the maximum principle.
The switching-point analysis for this problem is a good example of the
reverse-time construction technique used earlier in Chapters 4 and 5.
Extensions of these models to multi-state problems are treated in Turner
and Neuman (1976) and Srinivasan (1976).

7.1 The Nerlove-Arrow Advertising Model

The belief that advertising expenditures by a firm affect its present and
future sales, and hence its present and future net revenues, has led a
number of economists including Nerlove and Arrow (1962) to treat ad- 
vertising as an investment in building up some sort of advertising capital 
usually called goodwill. Furthermore, the stock of goodwill depreciates 
over time. Vidale and Wolfe (1957), Palda (1964), and others present 
empirical evidence that the effects of advertising linger on but diminish 
over time. 

Goodwill may be created by adding new customers or by altering the 
tastes and preferences of consumers and thus changing the demand func- 
tion for the firm’s product. The goodwill depreciates over time because 
consumers “drift” to other brands as a result of advertising by competing 
firms and introduction of new products and/or new brands, etc. 

7.1.1 The Model 

Let $G(t) \ge 0$ denote the stock of goodwill at time $t$. The price of (or cost
of producing) one unit of goodwill is \$1, so that a dollar spent on current
advertising increases goodwill by one unit. It is assumed that the stock
of goodwill depreciates over time at a constant proportional rate $\delta$, so
that

$$\dot{G} = u - \delta G, \quad G(0) = G_0, \tag{7.1}$$

where $u = u(t) \ge 0$ is the advertising effort at time $t$ measured in dollars
per unit time. In economic terms, equation (7.1) states that the net
investment in goodwill is the difference between gross investment $u(t)$
and depreciation $\delta G(t)$.
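As a small numerical illustration of the goodwill dynamics in (7.1), the sketch below evaluates the closed-form solution for a constant advertising rate and cross-checks it against a simple Euler integration. The function names and the parameter values in any call are illustrative assumptions, not part of the model's original presentation.

```python
import math


def goodwill_constant_u(G0, u, delta, t):
    """Closed-form solution of dG/dt = u - delta*G, G(0) = G0, for constant u.

    Assumes delta > 0. Goodwill relaxes exponentially toward u/delta.
    """
    steady = u / delta
    return steady + (G0 - steady) * math.exp(-delta * t)


def goodwill_euler(G0, u, delta, t, steps=100_000):
    """Forward-Euler integration of the same equation, used only as a cross-check."""
    dt = t / steps
    G = G0
    for _ in range(steps):
        G += dt * (u - delta * G)
    return G
```

With a constant rate $u$, goodwill converges to $u/\delta$, the level at which gross investment exactly offsets depreciation.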

To formulate the optimal control problem for a monopolistic firm,
assume that the rate of sales $S(t)$ depends on the stock of goodwill $G(t)$,
the price $p(t)$, and other exogenous variables $Z(t)$, such as consumer
income, population size, etc. Thus,

$$S = S(p, G, Z). \tag{7.2}$$

Assuming the rate of total production costs is $c(S)$, we can write the
total revenue net of production costs as

$$R(p, G, Z) = pS(p, G, Z) - c(S). \tag{7.3}$$

The revenue net of advertising expenditure is therefore $R(p, G, Z) - u$.
We assume that the firm wants to maximize the present value of net
revenue streams discounted at a fixed rate $\rho$, i.e.,

$$\max \left\{ J = \int_0^\infty e^{-\rho t}[R(p, G, Z) - u]\,dt \right\} \tag{7.4}$$

subject to (7.1).

Since the only place that $p$ occurs is in the integrand, we can maximize
$J$ by first maximizing $R$ with respect to price $p$ while holding $G$ fixed,
and then maximizing the result with respect to $u$. Thus,

$$\frac{\partial R}{\partial p} = S + p\frac{\partial S}{\partial p} - c'(S)\frac{\partial S}{\partial p} = 0, \tag{7.5}$$

which implicitly gives the optimal price $p^*(t) = p(G(t), Z(t))$. Defining
$\eta = -(p/S)(\partial S/\partial p)$ as the elasticity of demand with respect to price,
we can rewrite condition (7.5) as

$$p^* = \frac{\eta c'(S)}{\eta - 1}, \tag{7.6}$$

which is the usual price formula for a monopolist, known sometimes as
the Amoroso-Robinson relation. In words, the formula means that the
marginal revenue $(\eta - 1)p/\eta$ must equal the marginal cost $c'(S)$. See,
e.g., Cohen and Cyert (1965, p. 189).
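The step from condition (7.5) to the pricing formula (7.6) can be sketched as follows. This is only a worked rearrangement under the assumption that $\partial S/\partial p \neq 0$ and $\eta \neq 1$; it uses no symbols beyond those already defined.

```latex
% Start from (7.5) and divide by \partial S/\partial p (assumed nonzero):
\frac{S}{\partial S/\partial p} + p - c'(S) = 0 .
% By the definition \eta = -(p/S)(\partial S/\partial p), we have
\frac{S}{\partial S/\partial p} = -\frac{p}{\eta},
% so that
p - \frac{p}{\eta} = c'(S)
\quad\Longleftrightarrow\quad
\frac{(\eta - 1)\,p}{\eta} = c'(S)
\quad\Longleftrightarrow\quad
p^* = \frac{\eta\, c'(S)}{\eta - 1}.
```

The middle expression is the marginal-revenue-equals-marginal-cost form mentioned in the text.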

Defining $\pi(G, Z) = R(p^*, G, Z)$, the objective function in (7.4) can
be rewritten as

$$\max_{u \ge 0} \left\{ J = \int_0^\infty e^{-\rho t}[\pi(G, Z) - u]\,dt \right\}.$$

For convenience, we assume $Z$ to be a given constant and restate the
optimal control problem which we have just formulated:

$$\begin{cases}
\max\limits_{u \ge 0} \left\{ J = \int_0^\infty e^{-\rho t}[\pi(G) - u]\,dt \right\} \\
\text{subject to} \\
\dot{G} = u - \delta G, \quad G(0) = G_0.
\end{cases} \tag{7.7}$$

7.1.2 Solution by the Maximum Principle

While Nerlove and Arrow (1962) used calculus of variations, we use
Pontryagin's maximum principle to derive their results. We form the
current-value Hamiltonian

$$H = \pi(G) - u + \lambda[u - \delta G] \tag{7.8}$$

with the current-value adjoint variable $\lambda$ satisfying the differential
equation

$$\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial G} = (\rho + \delta)\lambda - \frac{d\pi}{dG} \tag{7.9}$$

and the condition that

$$\lim_{t \to +\infty} e^{-\rho t}\lambda(t) = 0. \tag{7.10}$$

Recall from Section 3.5 that this limit condition is only a sufficient
condition.

The adjoint variable $\lambda(t)$ is the shadow price associated with the
goodwill at time $t$. Thus, the Hamiltonian in (7.8) can be interpreted
as the dynamic profit rate which consists of two terms: (i) the current
net profit rate $(\pi(G) - u)$ and (ii) the value $\lambda\dot{G} = \lambda[u - \delta G]$ of the
new goodwill created by advertising at rate $u$. Also, equation (7.9)
corresponds to the usual equilibrium relation for investment in capital
goods; see Arrow and Kurz (1970) and Jacquemin (1973). It states that
the marginal opportunity cost $\lambda(\rho + \delta)dt$ of investment in goodwill should
equal the sum of the marginal profit $(d\pi/dG)dt$ from increased goodwill
and the capital gain $d\lambda := \dot{\lambda}dt$.

Defining $\beta = (G/S)(\partial S/\partial G)$ as the elasticity of demand with respect
to goodwill and using (7.3), (7.5), and (7.9), we can derive (see Exercise
7.3)

$$G^* = \frac{\beta p S}{\eta[(\rho + \delta)\lambda - \dot{\lambda}]}. \tag{7.11}$$

This is exactly the same result derived by Jacquemin (1973) with the
interpretation that at the optimum, the ratio of goodwill to sales revenue
$pS$ is directly proportional to the goodwill elasticity, inversely
proportional to the price elasticity, and inversely proportional to the sum of
the marginal opportunity cost $\lambda(\rho + \delta)$ of investment plus the rate at
which the potential contribution of a unit of goodwill to profits becomes
its past contribution $(-\dot{\lambda})$.

We use (3.74) to obtain the optimal long-run stationary equilibrium
or turnpike $\{\bar{G}, \bar{u}, \bar{\lambda}\}$. That is, we obtain $\lambda = \bar{\lambda} = 1$ from (7.8) by using
$\partial H/\partial u = 0$. We then set $\lambda = \bar{\lambda} = 1$ and $\dot{\lambda} = 0$ in (7.9), which gives
$d\pi/dG = \rho + \delta$. Finally, from (7.11) and (7.9), $\bar{G}$, which is also the
singular level $G^s$, can be obtained as

$$\bar{G} = G^s = \frac{\beta p S}{\eta(\rho + \delta)}. \tag{7.12}$$

The property of $\bar{G}$ is that the optimal policy is to go to $\bar{G}$ as fast
as possible. If $G_0 < \bar{G}$, it is optimal to jump instantaneously to $\bar{G}$ by
applying an appropriate impulse at $t = 0$ and then set $u^*(t) = \bar{u} = \delta\bar{G}$
for $t > 0$. If $G_0 > \bar{G}$, the optimal control is $u^*(t) = 0$ until the stock of
goodwill depreciates to the level $\bar{G}$, at which time the control switches
to $u^*(t) = \delta\bar{G}$ and stays at this level to maintain the level $\bar{G}$ of goodwill.
This optimal policy is graphed in Figure 7.1 for these two different initial
conditions.
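The two-case policy just described can be sketched in a few lines. The code below is a minimal illustration that takes the turnpike level as a given number; the function names and the argument `G_bar` are hypothetical labels chosen for this sketch, and it assumes $\delta > 0$ and positive goodwill levels.

```python
import math


def nerlove_arrow_policy(G0, G_bar, delta):
    """Return (impulse_size, switch_time, maintenance_rate) for the
    go-to-the-turnpike-as-fast-as-possible policy.

    If G0 < G_bar, goodwill jumps by G_bar - G0 at t = 0 (one dollar buys
    one unit of goodwill). If G0 >= G_bar, advertising is zero until
    G0*exp(-delta*t) has decayed to G_bar.
    """
    u_bar = delta * G_bar  # rate that exactly offsets depreciation
    if G0 < G_bar:
        return G_bar - G0, 0.0, u_bar
    t_switch = math.log(G0 / G_bar) / delta
    return 0.0, t_switch, u_bar


def goodwill_path(G0, G_bar, delta, t):
    """Goodwill under the policy above (value just after any initial impulse)."""
    _, t_switch, _ = nerlove_arrow_policy(G0, G_bar, delta)
    if t < t_switch:
        return G0 * math.exp(-delta * t)  # pure decay while u = 0
    return G_bar                          # held at the turnpike afterwards
```

For example, with `G0 = 10`, `G_bar = 5`, and `delta = 0.1`, the switch to the maintenance rate occurs at $t = \ln 2 / 0.1$.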




Figure 7.1: Optimal Policies in the Nerlove-Arrow Model

For a time-dependent $Z$, however, $\bar{G}(t) = \bar{G}(Z(t))$ will be a function
of time. To maintain this level of $\bar{G}(t)$, the required control is
$\bar{u}(t) = \delta\bar{G}(t) + \dot{\bar{G}}(t)$. If $\bar{G}(t)$ is decreasing sufficiently fast, then $\bar{u}(t)$
may become negative and thus infeasible. If $\bar{u}(t) \ge 0$ for all $t$, then the
optimal policy is as before. However, suppose $\bar{u}(t)$ is infeasible in the
interval $[t_1, t_2]$ shown in Figure 7.2. In such a case, it is feasible to set
$u(t) = \bar{u}(t)$ for $t \le t_1$; at $t = t_1$ (which is point A in Figure 7.2) we can no
longer stay on the turnpike and must set $u(t) = 0$ until we hit the
turnpike again (at point B in Figure 7.2). However, such a policy is not
necessarily optimal. For instance, suppose we leave the turnpike at point
C anticipating the infeasibility at point A. The new path CDEB may be
better than the old path CAB. Roughly the reason this may happen is
that path CDEB is "closer" to the turnpike than CAB. The picture in
Figure 7.2 illustrates such a case. The optimal policy is the one that is
"closest" to the turnpike. This discussion will become clearer in Section
7.2.2, when a similar situation arises in connection with the Vidale-Wolfe
model. For further details, see Sethi (1977b) and Breakwell (1968).

Figure 7.2: A Case of a Time-Dependent Turnpike and the Nature of
Optimal Control

From the point of view of control theory, the Nerlove-Arrow model
is an example involving bang-bang and impulse controls followed by a
singular control, which arises in a class of optimal control problems of
Model Type (b) in Table 3.3 that are linear in control. From the point
of view of advertising, it is a seminal paper in spite of the fact that its
bang-bang policy is not universally accepted as a realistic advertising
policy. To avoid such policies we discuss a nonlinear generalization.

7.1.3 A Nonlinear Extension



Nonlinear extensions of the Nerlove-Arrow model amount to making the
objective function nonlinear in advertising. Gould (1970) extended the
model by assuming a nonlinear cost of advertising effort. Jacquemin
(1973) assumed that the demand function $S$ also depends explicitly on
advertising effort $u$. As a result the function $\pi$ in the objective function
of (7.7) becomes

$$\pi = \pi(u, G).$$

Note that we have eliminated the exogenous variable $Z$ for convenience.

We use a simpler version of Jacquemin's model analyzed by Bensoussan
et al. (1974). They assumed that the function $\pi(u, G)$ is separable,
i.e.,

$$\pi(u, G) = \pi_1(u) + \pi_2(G),$$

where $G \ge 0$ and $\pi_1$ and $\pi_2$ are assumed to be increasing strictly concave
functions, i.e., $\pi_1' > 0$, $\pi_2' > 0$, $\pi_1'' < 0$, and $\pi_2'' < 0$. Note that this can
also be interpreted in the sense of Gould (1970) with advertising cost
$c(u) = u - \pi_1(u)$. The new optimal control problem is:



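$$\begin{cases}
\max\limits_{u \ge 0} \left\{ J = \int_0^\infty e^{-\rho t}[\pi_1(u) + \pi_2(G) - u]\,dt \right\} \\
\text{subject to} \\
\dot{G} = u - \delta G, \quad G(0) = G_0.
\end{cases} \tag{7.13}$$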



The Hamiltonian for the problem is

$$H = \pi_1(u) + \pi_2(G) - u + \lambda(u - \delta G), \tag{7.14}$$

where

$$\dot{\lambda} = \rho\lambda - \pi_2'(G) + \delta\lambda. \tag{7.15}$$

Differentiating $H$ with respect to $u$ gives

$$\frac{\partial H}{\partial u} = \pi_1'(u) - 1 + \lambda = 0. \tag{7.16}$$

Since $\pi_1'' < 0$, we can invert $\pi_1'$ to solve (7.16) for $u$ as a function of $\lambda$.
Thus,

$$u = f_1(\lambda). \tag{7.17}$$

In Exercise 7.9 you will be asked to show that $f_1$ is an increasing function
and $f_1(\lambda) > 0$ for $\lambda < 1$. The function $f_1$ in (7.17) permits us to
rewrite the state equation in (7.13) and the adjoint equation (7.15) as
the following initial value problem:

$$\begin{cases}
\dot{G} + \delta G = f_1(\lambda), & G(0) = G_0, \\
-\dot{\lambda} + (\rho + \delta)\lambda = \pi_2'(G), & \lambda(0) = \lambda_0,
\end{cases} \tag{7.18}$$

however, with the value of $\lambda_0$ unknown, and to be chosen according to
the discussion that follows. We also note that $\pi_2'(G) > 0$ for $G \ge 0$ and
$\pi_2'(G)$ is a decreasing function.



Figure 7.3: Phase Diagram of System (7.18) for Problem (7.13)

The system (7.18) can be analyzed geometrically in $(\lambda, G)$ space.
This method is called the phase diagram method, described next. Using
the above noted properties of the functions $f_1$ and $\pi_2'$ and setting $\dot{G} = 0$
and $\dot{\lambda} = 0$, we can draw the two equations in (7.18) as shown dotted
in the phase diagram of Figure 7.3. The intersection of the two curves
$\delta G = f_1(\lambda)$ and $(\rho + \delta)\lambda = \pi_2'(G)$ gives the point $(\bar{\lambda}, \bar{G})$ with $\bar{\lambda}$ satisfying
$0 < \bar{\lambda} < 1$; see Exercise 7.11. These two curves divide the plane into
four regions marked I, II, III, and IV in the figure. Note that $\{\bar{G}, \bar{u}, \bar{\lambda}\}$,
where $\bar{u} = \delta\bar{G}$, is the turnpike for this problem (see Section 3.5).
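To make the intersection point concrete, the sketch below computes it by bisection for one hypothetical choice of functional forms, $\pi_1(u) = 2\sqrt{u}$ and $\pi_2(G) = 2a\sqrt{G}$ with $a > 0$. These forms, the parameter `a`, and all function names are assumptions made only for illustration; they are increasing and strictly concave, as the model requires. With this choice, $\pi_1'(u) = 1 - \lambda$ inverts to $u = 1/(1-\lambda)^2$ for $\lambda < 1$.

```python
import math


def f1(lam):
    """Advertising rate implied by pi_1'(u) = 1 - lam when pi_1(u) = 2*sqrt(u)."""
    return 1.0 / (1.0 - lam) ** 2


def pi2_prime(G, a):
    """Marginal profit of goodwill when pi_2(G) = 2*a*sqrt(G) (assumed form)."""
    return a / math.sqrt(G)


def turnpike(rho, delta, a, tol=1e-12):
    """Bisection on lam in (0, 1) for the intersection of the two isoclines."""

    def gap(lam):
        G = f1(lam) / delta                            # goodwill isocline: delta*G = f1(lam)
        return (rho + delta) * lam - pi2_prime(G, a)   # residual of the adjoint isocline

    lo, hi = 0.0, 1.0 - 1e-12   # gap(lo) < 0 and gap(hi) > 0 for these forms
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    G = f1(lam) / delta
    return lam, G, delta * G    # (lambda_bar, G_bar, u_bar)
```

For these particular forms the residual is increasing in $\lambda$, so the root is unique and lies strictly between 0 and 1.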

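We note that

$$\begin{array}{ll}
\delta G - f_1(\lambda) > 0 & \text{in Regions I and II}, \\
\delta G - f_1(\lambda) < 0 & \text{in Regions III and IV}, \\
(\rho + \delta)\lambda - \pi_2'(G) > 0 & \text{in Regions I and III}, \\
(\rho + \delta)\lambda - \pi_2'(G) < 0 & \text{in Regions II and IV},
\end{array}$$

which implies that

$$\begin{array}{ll}
\text{in Region I}, & G \text{ decreasing and } \lambda \text{ increasing}, \\
\text{in Region II}, & G \text{ decreasing and } \lambda \text{ decreasing}, \\
\text{in Region III}, & G \text{ increasing and } \lambda \text{ increasing}, \\
\text{in Region IV}, & G \text{ increasing and } \lambda \text{ decreasing}.
\end{array}$$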


Because of these conditions it is clear that for a given $G_0$, a choice of
$\lambda_0$ such that $(\lambda_0, G_0)$ is in Regions II and III will not lead to a path
converging to the turnpike point $(\bar{\lambda}, \bar{G})$. On the other hand, the choice
of $(\lambda_0, G_0)$ in Region I when $G_0 > \bar{G}$ or $(\lambda_0, G_0)$ in Region IV when
$G_0 < \bar{G}$ can give a path that converges to $(\bar{\lambda}, \bar{G})$. From a result in
Coddington and Levinson (1955), it can be shown that at least in the
neighborhood of $(\bar{\lambda}, \bar{G})$, there exists a locus of optimum starting points
$(\lambda_0, G_0)$. A detailed analysis leads to the saddle point path as shown
darkened in Figure 7.3. To find an optimal solution for a given $G_0$,
we find the corresponding $\lambda_0$ on the saddle point path and then the
optimal trajectory graphed in the $(\lambda, G)$-space moves along the path in
the direction shown by the arrows.

Given $G_0 > \bar{G}$, we choose $\lambda_0$ on the saddle point path in Region I
of Figure 7.3. Clearly the initial control is $u^*(0) = f_1(\lambda_0)$. Furthermore,
$\lambda(t)$ is increasing and by (7.17), $u(t)$ is increasing, so that in this case
the optimal policy is to advertise at a low rate initially and gradually
increase advertising to the turnpike level $\bar{u} = \delta\bar{G}$. If $G_0 < \bar{G}$, it can
be shown similarly that the optimal policy is to advertise most heavily
in the beginning and gradually decrease it to the turnpike level $\bar{u}$ as $G$
approaches $\bar{G}$.

Note that the approach to the equilibrium G is no longer via the 
bang-bang control as in the Nerlove-Arrow model. This, of course, is 
what one would expect when a model is made nonlinear with respect to 
the control variable u. 

An alternate phase diagram procedure would be to solve the problem 
in (u,G) space, as carried out in Feichtinger and Hartl (1986, p.319). 
The procedure requires making use of (7.15) and (7.16) to derive the 
differential equation for u. This derivation is left as Exercise 7.10. 

7.2 The Vidale-Wolfe Advertising Model

We now present the analysis of the Vidale-Wolfe advertising model
which, in contrast to the Nerlove-Arrow model, does not make use of 
the idea of advertising goodwill; see Vidale and Wolfe (1957) and Sethi 
(1973a, 1974b). Instead the model exploits the closely related notion 
that the effect of advertising tends to persist, but with diminishing ef- 
fect, through subsequent time periods. This carryover effect is modeled 
explicitly by means of a differential equation that gives the relationship 
between sales and advertising. 

Vidale and Wolfe argued that changes in rate of sales of a product
depend on two effects: response to advertising which acts (via the
response constant $a$) on the unsold portion of the market, and loss due to
forgetting which acts (via the decay constant $b$) on the sold portion of
the market. Let $M(t)$, known as the saturation level or market potential,
denote the maximum potential rate of sales at time $t$. Let $S(t)$ be
the actual rate of sales at time $t$. Then, the Vidale-Wolfe model for a
monopolistic firm can be stated as

$$\dot{S} = au\left(1 - \frac{S}{M}\right) - bS. \tag{7.19}$$

The important feature of this equation, which distinguishes it from
the Nerlove-Arrow equation (7.1), is the idea of the finite saturation level
$M$. The Vidale-Wolfe model exhibits diminishing returns to the level of
advertising as a direct consequence of this saturation phenomenon. Note
that when $M$ is infinitely large, the saturation phenomenon disappears,
reducing (7.19) to the equation (with constant returns to advertising)
similar to the Nerlove-Arrow equation (7.1). Nerlove and Arrow, on the
other hand, include the idea of diminishing returns to advertising in their
model by making the sales $S$ in (7.2) a concave function of goodwill.
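The saturation effect in (7.19) is easy to see numerically. The sketch below gives the closed-form sales path for a constant advertising rate and a constant saturation level; the function name and the sample values used in any call are illustrative assumptions, and $b > 0$, $M > 0$ are assumed.

```python
import math


def vidale_wolfe_sales(S0, u, a, b, M, t):
    """Closed-form solution of dS/dt = a*u*(1 - S/M) - b*S with constant u and M.

    The equation is linear in S: dS/dt = a*u - k*S with k = a*u/M + b,
    so sales relax exponentially toward S_inf = a*u/k, which is below M.
    """
    k = a * u / M + b
    S_inf = a * u / k
    return S_inf + (S0 - S_inf) * math.exp(-k * t)
```

With `u = 0` the path reduces to pure decay at rate $b$, and as $M$ grows the long-run level approaches $au/b$, in line with the remark that saturation disappears for very large $M$.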

Vidale and Wolfe based their model on the results of several experi- 
mental studies of advertising effectiveness, which are described in Vidale 
and Wolfe (1957). 



7.2.1 Optimal Control Formulation for the Vidale-Wolfe Model

Whereas Vidale and Wolfe offered their model primarily as a description
of actual market phenomena represented by cases which they had
observed, we obtain the optimal advertising expenditures for the model in
order to maximize a certain objective function over the horizon $T$, while
also attaining a terminal sales target; see Sethi (1973a). For this, it is
convenient to transform (7.19) by making the change of variable

$$x = \frac{S}{M}. \tag{7.20}$$

Thus, $x$ represents the market share (or more precisely, the rate of sales
expressed as a fraction of the saturation level $M$). Furthermore, we
define

$$r = \frac{a}{M}, \quad \delta = b + \frac{\dot{M}}{M}. \tag{7.21}$$

Now we can rewrite (7.19) as

$$\dot{x} = ru(1 - x) - \delta x, \quad x(0) = x_0. \tag{7.22}$$
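The passage from (7.19) to (7.22) is a one-line application of the quotient rule. The sketch below spells it out, reading the definitions in (7.21) as $r = a/M$ and $\delta = b + \dot{M}/M$ (an assumption about the garbled display that is consistent with the remark that constant $M$ makes $\delta$ and $r$ constants).

```latex
% Differentiate x = S/M with respect to t:
\dot{x} = \frac{\dot{S}}{M} - \frac{S}{M}\,\frac{\dot{M}}{M}
        = \frac{1}{M}\Bigl[a u \Bigl(1 - \frac{S}{M}\Bigr) - bS\Bigr] - x\,\frac{\dot{M}}{M}
% Substitute S/M = x and collect the terms in x:
        = \frac{a}{M}\,u\,(1 - x) - \Bigl(b + \frac{\dot{M}}{M}\Bigr) x
        = r u (1 - x) - \delta x .
```

When $M$ is constant, $\dot{M} = 0$ and $\delta$ reduces to the decay constant $b$.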



From now on we assume $M$, and hence $\delta$ and $r$, to be positive constants.
It would not be difficult to extend the analysis when $M$ depends
on $t$, but we do not carry it out here. In Exercise 7.12 you are asked to
partially analyze the time-dependent case.

To define the optimal control problem arising from the Vidale-Wolfe
model, we let $\pi$ denote the maximum sales revenue corresponding to
$x = 1$, with $\pi x$ denoting the revenue function for $x \in [0, 1]$. Also let
$Q$ be the maximum allowable rate of advertising expenditure and let $\rho$
denote the continuous discount rate. With these definitions the optimal
control problem can be stated as follows:

$$\begin{cases}
\max \left\{ J = \int_0^T (\pi x - u)e^{-\rho t}\,dt \right\} \\
\text{subject to} \\
\dot{x} = ru(1 - x) - \delta x, \quad x(0) = x_0, \\
\text{the terminal state constraint} \\
x(T) = x_T, \\
\text{and the control constraint} \\
0 \le u \le Q.
\end{cases} \tag{7.23}$$

Here $Q$ can be finite or infinite and the target market share $x_T$ is in
$[0, 1]$. Note that the problem is a fixed-end-point problem. It is obvious
that the requirement $0 \le x \le 1$ holds without being imposed, where
$x_0 \in [0, 1]$ is the initial market share.

It is possible to solve this problem by an application of the maximum
principle; see Exercise 7.13. However, we will use instead a method based
on Green's theorem which does not make use of the maximum principle.
This method provides a convenient procedure for solving fixed-end-point
problems having one state variable and one control variable, and where
the control variable appears linearly in both the state equation and the
objective function; see Miele (1962) and Sethi (1977b). Problem (7.23)
has these properties, and therefore it is also a good example with which
to illustrate the method. For the application of Green's theorem we
require that $Q$ be large. In particular we can let $Q = \infty$.



7.2.2 Solution Using Green's Theorem When Q Is Large

In this section we will solve the fixed-end-point problem starting with $x_0$
and ending with $x_T$, under the assumption that $Q$ is either unbounded
or very large. The places where these assumptions are needed will be
indicated.

To make use of Green's theorem, it is convenient to consider times $\tau$
and $\theta$, where $0 \le \tau < \theta \le T$, and the problem:

$$\max \left\{ J = \int_\tau^\theta (\pi x - u)e^{-\rho t}\,dt \right\} \tag{7.24}$$

subject to

$$\dot{x} = ru(1 - x) - \delta x, \quad x(\tau) = A, \quad x(\theta) = B, \tag{7.25}$$

$$0 \le u \le Q. \tag{7.26}$$

To change the objective function in (7.24) into a line integral along any
feasible arc $\Gamma_1$ from $(\tau, A)$ to $(\theta, B)$ in $(t, x)$-space as shown in Figure
7.4, we multiply (7.25) by $dt$ and obtain the formal relation

$$u\,dt = \frac{dx + \delta x\,dt}{r(1 - x)},$$

which we substitute into the objective function (7.24). Thus,

$$J_{\Gamma_1} = \int_{\Gamma_1} \left\{ \left[\pi x - \frac{\delta x}{r(1 - x)}\right] e^{-\rho t}\,dt - \frac{e^{-\rho t}}{r(1 - x)}\,dx \right\}.$$

Figure 7.4: Feasible Arcs in $(t, x)$-Space

Consider another feasible arc $\Gamma_2$ from $(\tau, A)$ to $(\theta, B)$ lying above $\Gamma_1$
as shown in Figure 7.4. Let $\Gamma = \Gamma_1 - \Gamma_2$, where $\Gamma$ is a simple closed
curve traversed in the counter-clockwise direction. That is, $\Gamma$ goes along
$\Gamma_1$ in the direction of its arrow and along $\Gamma_2$ in the direction opposite to
its arrow. We now have

$$J_\Gamma = J_{\Gamma_1 - \Gamma_2} = J_{\Gamma_1} - J_{\Gamma_2}. \tag{7.27}$$







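Since $\Gamma$ is a simple closed curve, we can use Green's theorem to
express $J_\Gamma$ as an area integral over the region $R$ enclosed by $\Gamma$. Thus,
treating $x$ and $t$ as independent variables, we can write

$$\begin{aligned}
J_\Gamma &= \oint_\Gamma \left\{ \left[\pi x - \frac{\delta x}{r(1 - x)}\right] e^{-\rho t}\,dt - \frac{e^{-\rho t}}{r(1 - x)}\,dx \right\} \\
&= \iint_R \left\{ \frac{\partial}{\partial t}\left[\frac{-e^{-\rho t}}{r(1 - x)}\right] - \frac{\partial}{\partial x}\left[\left(\pi x - \frac{\delta x}{r(1 - x)}\right) e^{-\rho t}\right] \right\} dt\,dx \\
&= \iint_R \left[ \frac{\delta}{(1 - x)^2} + \frac{\rho}{(1 - x)} - \pi r \right] \frac{e^{-\rho t}}{r}\,dt\,dx.
\end{aligned} \tag{7.28}$$

Denote the term in brackets of the integrand of (7.28) by

$$I(x) = \frac{\delta}{(1 - x)^2} + \frac{\rho}{(1 - x)} - \pi r. \tag{7.29}$$

Note that the sign of the integrand is the same as the sign of $I(x)$.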



Lemma 7.1 (Comparison Lemma). Let $\Gamma_1$ and $\Gamma_2$ be the lower and
upper feasible arcs as shown in Figure 7.4. If $I(x) \ge 0$ for all $(x, t) \in R$,
then the lower arc $\Gamma_1$ is at least as profitable as the upper arc $\Gamma_2$.
Analogously, if $I(x) \le 0$ for all $(x, t) \in R$, then $\Gamma_2$ is at least as profitable as $\Gamma_1$.

Proof. If $I(x) \ge 0$ for all $(x, t) \in R$, then $J_\Gamma \ge 0$ from (7.28)
and (7.29). Hence from (7.27), $J_{\Gamma_1} \ge J_{\Gamma_2}$. The proof of the other
statement is similar. □



To make use of this lemma to find the optimal control for the problem
stated in (7.23), we need to find regions where $I(x)$ is positive and where
it is negative. For this, note first that $I(x)$ is an increasing function of
$x$ in $[0, 1]$. Solving $I(x) = 0$ will give that value of $x$, above which $I(x)$
is positive and below which $I(x)$ is negative. Since $I(x)$ is quadratic in
$1/(1 - x)$, we can use the quadratic formula (see Exercise 7.15) to get

$$x = 1 - \frac{2\delta}{-\rho \pm \sqrt{\rho^2 + 4\pi r\delta}}.$$

To keep $x$ in the interval $[0, 1]$, we must choose the positive sign before
the radical. The optimal $x$ must be nonnegative so we have

$$x^s = \max\left\{ 1 - \frac{2\delta}{-\rho + \sqrt{\rho^2 + 4\pi r\delta}},\ 0 \right\}, \tag{7.30}$$

where the superscript $s$ is used because this will turn out to be a singular
trajectory. Since $x^s$ is nonnegative, the control

$$u^s = \frac{\delta x^s}{r(1 - x^s)} \tag{7.31}$$

corresponding to (7.30) will always be nonnegative. Also since $Q$ is
assumed to be large, $u^s$ will always be feasible. Moreover, in Exercise
7.16, you will be asked to show that $x^s = 0$ and $u^s = 0$ if, and only if,
$\pi r \le \delta + \rho$.
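The singular level and its associated control are straightforward to evaluate. The sketch below computes $x^s$ from (7.30), the control $u^s = \delta x^s / (r(1 - x^s))$ that holds $\dot{x} = 0$ at that level, and the function $I(x)$ from (7.29) as a check. Function names and the parameter values in any call are illustrative assumptions; positive $\pi$, $r$, $\delta$ and nonnegative $\rho$ are assumed.

```python
import math


def I_of_x(x, pi, r, delta, rho):
    """The bracketed term I(x) = delta/(1-x)^2 + rho/(1-x) - pi*r."""
    return delta / (1.0 - x) ** 2 + rho / (1.0 - x) - pi * r


def singular_level(pi, r, delta, rho):
    """Return (x_s, u_s): the singular market share and the control that sustains it."""
    root = -rho + math.sqrt(rho ** 2 + 4.0 * pi * r * delta)  # positive-sign root
    x_s = max(1.0 - 2.0 * delta / root, 0.0)
    u_s = delta * x_s / (r * (1.0 - x_s))  # from xdot = r*u*(1-x) - delta*x = 0
    return x_s, u_s
```

For instance, with $\pi = 2$, $r = 1$, and $\delta = \rho = 0.1$, this gives $x^s = 0.75$ and $u^s = 0.3$, and $I(x^s)$ evaluates to zero up to rounding.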

We now have enough machinery to obtain the optimal solution for
(7.23) when $Q$ is assumed to be sufficiently large, i.e., $Q \ge u^s$, where $u^s$
is given in (7.31). We state these in the form of two theorems: Theorem
7.1 refers to the case in which $T$ is large; Theorem 7.2 refers to the case in
which $T$ is small. To define these concepts, let $t_1$ be the shortest time to
go from $x_0$ to $x^s$ and similarly let $t_2$ be the shortest time to go from $x^s$ to
$x_T$. Then, we say $T$ is large if $T \ge t_1 + t_2$; otherwise $T$ is small. Figures
7.5-7.8 show cases for which $T$ is large, while Figures 7.10-7.11 show
cases for which $T$ is small. In Exercise 7.18 you are asked to determine
whether $T$ is large or small in specific cases. In the statements of the
theorems we will assume that $x_0$ and $x_T$ are such that $x_T$ is reachable
from $x_0$. In Exercise 7.19 you are asked to find the reachable set for any
given initial condition $x_0$.

In Figures 7.5-7.8, the quantities $t_1$ and $t_2$ are case dependent and
not necessarily the same; see Exercise 7.17.

Theorem 7.1 Let $T$ be large and let $x_T$ be reachable from $x_0$. For the
Cases 1-4 of inequalities relating $x_0$ and $x_T$ to $x^s$, the optimal trajectories
are given in Figures 7.5-7.8, respectively.

Proof. We give details for Case 1 only. The proofs for the other cases are
similar. Figure 7.9 shows the optimal trajectory for Figure 7.5 together
with an arbitrarily chosen feasible trajectory, shown dotted. It should
be clear that the dotted trajectory cannot cross the arc $x_0$ to C, since
$u = Q$ on that arc. Similarly the dotted trajectory cannot cross the arc
G to $x_T$, because $u = 0$ on that arc.

We subdivide the interval $[0, T]$ into subintervals over which the
dotted arc is either above, below, or identical to the solid arc. In Figure
7.9 these subintervals are $[0, d]$, $[d, e]$, $[e, f]$, and $[f, T]$. Because $I(x)$ is
positive for $x > x^s$ and $I(x)$ is negative for $x < x^s$, the regions enclosed
by the two trajectories have been marked with a + or - sign depending
on whether $I(x)$ is positive or negative on the regions, respectively.
By Lemma 7.1, the solid arc is better than the dotted arc in the
subintervals $[0, d]$, $[d, e]$, and $[f, T]$; in interval $[e, f]$, they have identical
values. Hence the dotted trajectory is inferior to the solid trajectory.
This proof can be extended to any (countable) number of crossings of
the trajectories; see Sethi (1977b). □

Figure 7.6: Optimal Trajectory for Case 2: $x_0 < x^s$ and $x^s < x_T$

Figure 7.9: Optimal Trajectory (Solid Lines)

Figures 7.5-7.8 are drawn for the situation that $T > t_1 + t_2$. You are
asked to consider the case $T = t_1 + t_2$ in Exercise 7.23. In the following
theorem, the case $T < t_1 + t_2$ is dealt with.

Theorem 7.2 Let $T$ be small, i.e., $T < t_1 + t_2$, and let $x_T$ be reachable
from $x_0$. For the two possible Cases 1 and 2 of inequalities relating
$x_0$ to $x_T$ and $x^s$, the optimal trajectories are given in Figures 7.10 and
7.11, respectively.

Proof. The requirement of feasibility when $T$ is small rules out cases
where $x_0$ and $x_T$ are on opposite sides of or equal to $x^s$. The proofs of
optimality of the trajectories shown in Figures 7.10 and 7.11 are similar
to the proofs of the parts of Theorem 7.1, and are left as Exercise 7.23.
In Figures 7.10 and 7.11, it is possible to have either $t_1 > T$ or $t_2 > T$.
Try sketching some of these special cases. □

Figure 7.10: Optimal Trajectory When $T$ Is Small in Case 1: $x_0 < x^s$
and $x_T < x^s$

Figure 7.11: Optimal Trajectory When $T$ Is Small in Case 2: $x_0 > x^s$
and $x_T > x^s$

All the previous discussion has assumed that $Q$ was finite and sufficiently
large, but we can easily extend this to the case when $Q = \infty$. This
possibility makes the arcs in Figures 7.5-7.10, corresponding to $u^* = Q$,
become vertical line segments corresponding to impulse controls. For
example Figure 7.6 becomes Figure 7.12 when $Q = \infty$.

Figure 7.12: Optimal Trajectory for Case 2 of Theorem 7.1 for $Q = \infty$



The effect on the objective function of the impulse control
$\text{imp}(x_0, x^s; 0)$ can be computed as follows: we integrate the state
equation in (7.23) from $0$ to $\varepsilon$ with the initial condition $x_0$ and $u$ treated
as constant. Assume $x_0 < x^s$, since $\text{imp}(x_0, x^s; 0) = 0$ in the trivial case
when $x_0 = x^s$. The result is

$$x(\varepsilon) = \left(x_0 - \frac{ru}{\delta + ru}\right) e^{-(\delta + ru)\varepsilon} + \frac{ru}{\delta + ru}.$$

According to the procedure at the end of Section 1.4, we must, for $u$,
choose $u(\varepsilon)$ so that $x(\varepsilon)$ is $x^s$. It should be clear that $u(\varepsilon) \to \infty$ as $\varepsilon \to 0$.
Noting that the right-hand side of (1.23) is

$$\lim_{\varepsilon \to 0}[-u(\varepsilon)\varepsilon] = \text{imp}(x_0, x^s; 0) = I,$$

it is possible to solve for $I$ directly without obtaining $u(\varepsilon)$. This is done
by letting $\varepsilon \to 0$, $-u(\varepsilon)\varepsilon \to I$, and $x(\varepsilon) = x^s$ in the expression for $x(\varepsilon)$
obtained above. This gives

$$x(0^+) = e^{rI}(x_0 - 1) + 1.$$

Therefore,

$$\text{imp}(x_0, x^s; 0) = -\frac{1}{r}\ln\left[\frac{1 - x_0}{1 - x^s}\right]. \tag{7.32}$$

We remark that this formula holds for any time $t$, as well as $t = 0$. Hence
it can also be used at $t = T$ to compute the impulse at the end of the
period; see Figure 7.12 and Exercise 7.25.
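The limiting argument behind (7.32) can be checked numerically. The sketch below implements the impulse formula, the post-impulse state, and the finite-$\varepsilon$ state under a constant control, so that a very large control applied over a very short interval can be compared with the limit. All function names and numerical values are illustrative assumptions.

```python
import math


def impulse_effect(x0, x_target, r):
    """Effect on the objective of an upward jump from x0 to x_target:
    -(1/r) * ln((1 - x0)/(1 - x_target)). It is negative, i.e., a cost."""
    return -math.log((1.0 - x0) / (1.0 - x_target)) / r


def state_after_impulse(x0, I, r):
    """Post-jump state x(0+) = exp(r*I)*(x0 - 1) + 1."""
    return math.exp(r * I) * (x0 - 1.0) + 1.0


def state_at_eps(x0, u, r, delta, eps):
    """State at time eps under constant control u, starting from x0."""
    frac = r * u / (delta + r * u)
    return (x0 - frac) * math.exp(-(delta + r * u) * eps) + frac
```

Choosing `u = -I/eps` for a small `eps` makes `state_at_eps` land very close to the target share, which is the sense in which the impulse is the limit of large finite controls.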



7.2.3 Solution When Q Is Small 

When $Q$ is small, it is not possible to go along the turnpike $x^s$, so that
the arguments based on Green's theorem become difficult to apply. We
therefore return to the maximum principle approach to analyze the
problem. By “$Q$ is small” we mean $Q < u^s$, where $u^s$ is defined in (7.31).
Another characterization of the phrase “$Q$ is small” in terms of the
problem parameters is given in Exercise 7.27.

We now apply the current-value maximum principle (3.41) to the
fixed-end-point problem given in (7.23). We form the current-value
Hamiltonian as

$$H = \pi x - u + \lambda[ru(1 - x) - \delta x] = \pi x - \delta\lambda x + u[-1 + r\lambda(1 - x)], \tag{7.33}$$

and the Lagrangian function as

$$L = H + \mu(Q - u). \tag{7.34}$$

The adjoint variable $\lambda$ satisfies

$$\dot{\lambda} = \rho\lambda - \frac{\partial L}{\partial x} = \rho\lambda + \lambda(ru + \delta) - \pi, \tag{7.35}$$

where $\lambda(T)$ is a constant, as in Row 2 of Table 3.1, that must be
determined. Furthermore, the Lagrange multiplier $\mu$ in (7.34) must satisfy

$$\mu \ge 0, \quad \mu(Q - u) = 0. \tag{7.36}$$

From (7.33) we notice that the Hamiltonian is linear in the control. So
the optimal control is

$$u^*(t) = \text{bang}[0, Q; W(t)], \tag{7.37}$$

where

$$W(t) = r\lambda(1 - x) - 1. \tag{7.38}$$
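The bang-bang rule in (7.37) and (7.38) can be written as a small helper. This sketch assumes the usual reading of the bang function: the lower bound when the switching function is negative, the upper bound when it is positive, and no determination when it is zero (the possible singular case). Returning `None` for that case is a convention chosen here for illustration only.

```python
def switching_function(lam, x, r):
    """W = r*lam*(1 - x) - 1, the coefficient of u in the Hamiltonian."""
    return r * lam * (1.0 - x) - 1.0


def bang(lower, upper, W):
    """Bang-bang selector: lower bound if W < 0, upper bound if W > 0.

    When W == 0 the Hamiltonian does not determine u; None signals that
    a singular control may apply and must be found by other means.
    """
    if W < 0:
        return lower
    if W > 0:
        return upper
    return None
```

For example, `bang(0.0, Q, switching_function(lam, x, r))` returns the advertising rate prescribed by the Hamiltonian maximization whenever the switching function is nonzero.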

We remark that the sufficiency conditions of Section 2.4, which require
concavity of the derived Hamiltonian, do not apply here; see Exercise
7.30. However, the sufficiency of the maximum principle for this
kind of problem has been established in the literature; see, for example,
Lansdowne (1970).

When $W = r\lambda(1 - x) - 1 = 0$, we have the possibility of singular
control, provided we can maintain this equality over a finite time interval.
For the case when $Q$ is large, we showed in the previous section that the
optimal trajectory contains a segment on which $x = x^s$ and $u^* = u^s$,
where $0 \le u^s \le Q$. (See Exercise 7.27 for the condition that $Q$ is small.)
This can obviously be a singular control. Further discussion of singular
control is given in Appendix D.3.

A complete solution of problem (7.23) when Q is small requires a 
lengthy switching point analysis. The details are too voluminous to give 
here, but an interested reader can find the details in Sethi (1973a). 



7.2.4 Solution When T Is Infinite 

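In the previous two sections we assumed that $T$ was finite. We now
formulate the infinite horizon version of (7.23):

$$\begin{cases}
\max \left\{ J = \int_0^\infty (\pi x - u)e^{-\rho t}\,dt \right\} \\
\text{subject to} \\
\dot{x} = ru(1 - x) - \delta x, \quad x(0) = x_0, \\
0 \le u \le Q.
\end{cases} \tag{7.39}$$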

We divide the analysis of this problem into the same two cases defined 
as before, namely, “Q is large” and “Q is small”. 

When $Q$ is large, the results of Theorem 7.1 suggest the solution
when $T$ is infinite. Because of the discount factor, the ending parts of
the solutions shown in Figures 7.5-7.8 can be shown to be irrelevant (i.e.,
the discounted profit accumulated during the interval $(T - t_2, T)$ goes to
0 as $T$ goes to $\infty$). Therefore, we only have two cases: (a) $x_0 \le x^s$ and
(b) $x_0 > x^s$. The optimal control in Case (a) is to use $u^* = Q$ in the
interval $[0, t_1)$ and $u^* = u^s$ for $t \ge t_1$. Similarly, the optimal control in
Case (b) is to use $u^* = 0$ in the interval $[0, t_1)$ and $u^* = u^s$ for $t \ge t_1$.

An alternate way to see that the above solutions give $u^* = u^s$ for
$t \ge t_1$ is to check that they satisfy the turnpike conditions (3.73). To do
this we need to find the values of state, control, and adjoint variables and
the Lagrange multiplier along the turnpike. It can be easily shown that
$x = x^s$, $u = u^s$, $\lambda^s = \pi/(\rho + \delta + ru^s)$, and $\mu^s = 0$ satisfy the turnpike
conditions (3.73).

When $Q$ is small, i.e., $Q < u^s$, it is not possible to follow the
turnpike $x = x^s$, because that would require $u = u^s$, which is not a feasible
control. Intuitively, it seems clear that the “closest” stationary path
which we can follow is the path obtained by setting $\dot{x} = 0$ and $u = Q$,
the largest possible control, in the state equation of (7.39). This gives

$$\bar{x} = \frac{rQ}{rQ + \delta}, \tag{7.40}$$

and correspondingly we obtain

$$\bar{\lambda} = \frac{\pi}{\rho + \delta + rQ} \tag{7.41}$$

by setting $u = Q$ and $\dot{\lambda} = 0$ in (7.35) and solving for $\lambda$. More specifically,
we state the following theorems which give the turnpike and optimal
control when $Q$ is small. To prove these theorems we need to define two
more quantities, namely,

$$\hat{x} = 1 - 1/r\bar{\lambda}, \tag{7.42}$$

$$\bar{\mu} = r\bar{\lambda}(1 - \bar{x}) - 1. \tag{7.43}$$
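The four quantities defined in (7.40) through (7.43) can be evaluated together. The sketch below reads them as the stationary share under $u = Q$, the corresponding adjoint value, the share at which the switching function vanishes for that adjoint value, and the multiplier on the upper control bound. The function name and sample parameter values are illustrative assumptions; positive parameters are assumed.

```python
def small_q_turnpike(pi, r, delta, rho, Q):
    """Return (x_bar, lam_bar, x_hat, mu_bar) for the small-Q stationary path."""
    x_bar = r * Q / (r * Q + delta)           # xdot = 0 with u = Q
    lam_bar = pi / (rho + delta + r * Q)      # lamdot = 0 with u = Q
    x_hat = 1.0 - 1.0 / (r * lam_bar)         # share where r*lam_bar*(1 - x) = 1
    mu_bar = r * lam_bar * (1.0 - x_bar) - 1.0
    return x_bar, lam_bar, x_hat, mu_bar
```

With $\pi = 2$, $r = 1$, $\delta = \rho = 0.1$ (for which the singular control level is 0.3) and $Q = 0.2$, the stationary share is $2/3$, which lies below the switching share 0.8, and the multiplier is positive.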





Theorem 7.3 When $Q$ is small, the following quantities

$$\{\bar{x}, Q, \bar{\lambda}, \bar{\mu}\} \tag{7.44}$$

form a turnpike.

Proof. We show that the conditions in (3.73) hold for (7.44). The
first two are obvious. By Exercise 7.28 we know $\bar{x} \le \hat{x}$, which, from
definitions (7.42) and (7.43), implies $\bar{\mu} \ge 0$. Furthermore $\bar{u} = Q$,
so (7.36) holds and the third condition of (3.73) also holds. Finally
because $W = \bar{\mu}$ from (7.38) and (7.43), it follows that $W \ge 0$, so the
Hamiltonian maximizing condition of (3.73) holds with $\bar{u} = Q$, and the
proof is complete. □

Theorem 7.4 When $Q$ is small, the optimal control at time $\tau$ is given
by:

(a) If $x(\tau) \le \hat{x}$, then $u^*(\tau) = Q$.

(b) If $x(\tau) > \hat{x}$, then $u^*(\tau) = 0$.

Proof. (a) We set $\lambda(t) = \bar{\lambda}$ for all $t \ge \tau$ and note that $\bar{\lambda}$ satisfies the
adjoint equation (7.35) and the transversality condition (3.70).

By Exercise 7.28 and the assumption that $x(\tau) \le \hat{x}$, we know that
$x(t) \le \hat{x}$ for all $t$. The proof that (7.36) and (7.37) hold for all $t \ge \tau$
relies on the fact that $x(t) \le \hat{x}$ and on an argument similar to the proof
of the previous theorem.

Figure 7.13 shows the optimal trajectories for $x(0) < \hat{x}$ and two
different starting values $x(0)$, one above and the other below $\bar{x}$. Note
that in this figure we are always in Case (a) since $x(\tau) \le \hat{x}$ for all $\tau \ge 0$.

Figure 7.13: Optimal Trajectories for $x(0) < \hat{x}$


(b) Assume $x(0) > \hat{x}$. For this case we will show that the optimal
trajectory is as shown in Figure 7.14, which is obtained by applying
$u = 0$ until $x = \hat{x}$ and $u = Q$ thereafter. Using this policy we can find
the time $t_1$ at which $x(t_1) = \hat{x}$, by solving the state equation in (7.39)
with $u = 0$. This gives

$$t_1 = \frac{1}{\delta} \ln \frac{x(0)}{\hat{x}}. \qquad (7.45)$$




Clearly for $t \ge t_1$, the policy $u = Q$ is optimal because Case (a)
applies. We now consider the interval $[0, t_1)$. Let $\tau$ be any time in this
interval as shown in Figure 7.14, and let $x(\tau)$ be the corresponding value
of the state variable. Consider the following two-point boundary value
problem in the interval $[\tau, t_1]$:

$$\dot{x} = -\delta x, \quad x(\tau) \text{ given}, \qquad \dot{\lambda} = (\rho + \delta)\lambda - \pi, \quad \lambda(t_1) = \bar{\lambda}. \qquad (7.46)$$

In Exercise 7.31 you are asked to show that the switching function
$W(t)$ defined in (7.38) is negative in the interval $(\tau, t_1)$ and $W(t_1) = 0$.
Therefore by (7.37), the policy $u = 0$ used in deriving (7.46) satisfies
the maximum principle. This policy "joins" the optimal policy after $t_1$
because $\lambda(t_1) = \bar{\lambda}$.
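The argument for Case (b) can be illustrated numerically. The Python sketch below computes $t_1$ from (7.45), solves the two linear equations in (7.46) in closed form, and evaluates the switching function on $[0, t_1]$. It assumes the switching function has the form $W = r\lambda(1 - x) - 1$; the parameter values and the starting state `x0` are hypothetical and chosen only so that $Q$ is small and $x(0) > \hat{x}$.

```python
import math

r, delta, rho, pi, Q = 0.2, 0.05, 0.1, 2.0, 0.1   # illustrative values only
lam_bar = pi / (rho + delta + r * Q)
x_hat = 1.0 - 1.0 / (r * lam_bar)
x0 = 0.9                                          # hypothetical start above x_hat

t1 = math.log(x0 / x_hat) / delta                 # switching time, as in (7.45)

def x_path(t):
    """State under u = 0: solution of x' = -delta*x, x(0) = x0."""
    return x0 * math.exp(-delta * t)

def lam_path(t):
    """Adjoint under u = 0 with terminal condition lambda(t1) = lam_bar."""
    lam_inf = pi / (rho + delta)
    return lam_inf + (lam_bar - lam_inf) * math.exp((rho + delta) * (t - t1))

def W(t):
    """Assumed switching function W = r*lambda*(1 - x) - 1."""
    return r * lam_path(t) * (1.0 - x_path(t)) - 1.0

grid = [t1 * i / 200 for i in range(200)]         # sample points in [0, t1)
w_values = [W(t) for t in grid]
print(t1, max(w_values), W(t1))
```

On this example every sampled value of $W$ before $t_1$ is negative and $W(t_1)$ vanishes up to rounding, which is the behavior Exercise 7.31 asks the reader to establish in general.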

In this book the sufficiency of the transversality condition (3.70) was
stated under the hypothesis that the derived Hamiltonian was concave;
see Theorem 2.1. In the present example, this hypothesis does not hold.
However, as mentioned in Section 7.2.3, for this simple bilinear problem
it can be shown that (3.70) is sufficient for optimality. Because of the
technical nature of this issue we omit the details. □



EXERCISES FOR CHAPTER 7 

7.1 In equations (7.2) and (7.3), assume S(p, G) = 1000 — bp + 2G and 
c(S) = 6S. Substitute into (7.5) and solve for the optimal price p* 
in terms of G. 

7.2 Derive the optimal monopoly price formula in (7.6) from (7.5). 

7.3 Derive the optimal goodwill level shown in (7.11).

7.4 Show that the only possible singular level of goodwill (which can
be maintained over a finite time interval) is the value $\bar{G}$ obtained
in (7.12).

7.5 Show that the total cost of advertising required to go from $G_0 < \bar{G}$
to $\bar{G}$ instantaneously (by an impulse) is $\bar{G} - G_0$.

[Hint: Integrate $\dot{G} = u - \delta G$, $G(0) = G_0$, from 0 to $\varepsilon$ and equate
$\bar{G} = \lim G(\varepsilon)$, where the limit is taken as $\varepsilon \to 0$ and $u \to \infty$ in such
a way that $u\varepsilon \to \text{cost} = -\mathrm{imp}(G_0, \bar{G}, 0)$. See also the derivation of
(7.32).]

7.6 Assume the effect of the exogenous variable $Z(t)$ is seasonal so that
the goodwill $G(t) = 2 \sin t$. Assume $\delta = 0.1$. Sketch the graph
of $u(t) = \delta G + \dot{G}$, similar to Figure 7.2, and identify intervals in
which maintaining the singular level $G(t)$ is infeasible.

7.7 In the Nerlove-Arrow Model of Sections 7.1.1 and 7.1.2, assume
$S(p, G, Z) = \alpha p^{-\eta} G^{\beta} Z^{\gamma}$ and $c(S) = cS$. Show that the optimal
stationary policy gives $u/pS$ = constant, i.e., that the optimal
advertising level is a constant fraction of sales regardless of the
value of $Z$. (Such policies are followed by many industries.)

7.8* Re-solve the advertising model of (7.13) with its objective function
replaced by

$$\max_{u \ge 0} \left\{ \int_0^\infty e^{-\rho t}[\pi(G) - \gamma(u)]\,dt \right\},$$

where $\gamma(u)$ represents an increasing convex advertising cost function
with $\gamma(u) \ge 0$, $\gamma'(u) \ge 0$, and $\gamma''(u) > 0$ for $u \ge 0$. This is the
model of Gould (1970).
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7.9 Show that /i(A) in (7.17) is a nonnegative and increasing function 
for A < 1. 



7.10 Develop a system of equations for the model of Section 7.1.1 similar
to those in (7.18) for the phase diagram analysis in $(u, G)$-space for
the problem of (7.13); see Feichtinger and Hartl (1986).

7.11 Show that the $\bar{\lambda}$ shown in Figure 7.3 satisfies $0 < \bar{\lambda} < 1$.

7.12 In (7.22), assume $r$ and $\delta$ are positive, differentiable functions of
time. Derive expressions similar to (7.28)-(7.31) in order to get the
new turnpike values.

7.13 For problem (7.23) with $\pi r > \delta + \rho$ and $Q$ sufficiently large, derive
the turnpike $\{\bar{x}, \bar{u}, \bar{\lambda}\}$ by using the maximum principle. Check your
answers with the corresponding values derived by Green's theorem.
Show that when $\rho = 0$, $\bar{x}$ reduces to the golden path rule. (See
Exercise 7.16 for the case when $\pi r \le \delta + \rho$.)

7.14 Let $x^s$ be such that $A < x^s < B$ in Figure 7.4, and assume that
$I(x) > 0$ for $x > x^s$ and $I(x) < 0$ for $x < x^s$. Construct paths $\Gamma_1$
and $\Gamma_2$.

[Hint: Use Lemma 7.1.]

7.15 Solve the quadratic equation $I(x) = 0$, where $I(x)$ is defined in
(7.29), to obtain the expression for $x$ shown as the first argument
in (7.30).

7.16 Show that both $x^s$ in (7.30) and $u^s$ in (7.31) are 0 if, and only if,
$\pi r \le \delta + \rho$.



7.17 For the problem in (7.23), suppose $x_0$ and $x_T$ are given and define
$x^s$ as in (7.30). Let $t_1$ be the shortest time to go from $x_0$ to $x^s$,
and $t_2$ be the shortest time to go from $x^s$ to $x_T$.

(a) If $x_0 < x^s$ and $x^s > x_T$, show that

$$t_1 = \frac{1}{rQ + \delta} \ln \frac{\bar{x} - x_0}{\bar{x} - x^s}, \qquad t_2 = \frac{1}{\delta} \ln \frac{x^s}{x_T},$$

where $\bar{x} = rQ/(rQ + \delta)$; assume $\bar{x} > x^s$.

(b) Using the form of the answers in (a), find $t_1$ and $t_2$ when
$x_0 > x^s$ and $x^s < x_T < \bar{x}$.

7.18 For Exercise 7.17(a), write the condition that $T$ is large, i.e., $T \ge t_1 + t_2$,
in terms of all the parameters of problem (7.23).

7.19 For problem (7.23), find the reachable set for a given initial $x_0$ and
horizon time $T$.

7.20 (a) For problem (7.23), assume $r = 0.2$, $\delta = 0.05$, $\rho = 0.1$, $Q = 5$,
$\pi = 2$, $x_0 = 0.2$ and $x_T = 0.3$. Use Exercises 7.17(a) and
7.18 to show that $T \ge 10.623$ is large and $T < 10.623$ is small.
Sketch the optimal trajectories for $T = 13$ and $T = 18$.

(b) Redo (a) when $x_T = 0.7$. Show that both $T = 13$ and $T = 8$
are large.

7.21 Prove Theorem 7.1 for Case 3. 

7.22 Draw four figures for the case $T = t_1 + t_2$ corresponding to Figures
7.5-7.8.

7.23 Prove Theorem 7.2.

7.24 Sketch one or two other possible curves for the case when $T$ is
small.

7.25 Obtain the impulse function, $\mathrm{imp}(A, B, t)$, required to take the
state from $A$ up to $B$ instantaneously at time $t$ for the Vidale-Wolfe
model in Section 7.2.2.

7.26 (a) Re-solve Exercise 7.20(a) with $Q = \infty$. Show $T = 10.5$ is no
longer small.

(b) Show that $T > 0$ is large for Exercise 7.20(b) when $Q = \infty$.
Find the optimal value of the objective function when $T = 8$.

7.27 Show that $Q$ is small if, and only if,

$$\frac{\pi r \delta}{(\delta + \rho + rQ)(\delta + rQ)} > 1.$$

7.28 (a) Show that $\bar{x} < x^s < \hat{x}$ when $Q$ is small, where $\hat{x}$ is defined in
(7.42).

(b) Show that $\bar{x} > x^s$ when $Q$ is large.

7.29 Derive (7.45).

7.30* Show the derived Hamiltonian corresponding to (7.33) and
(7.37) is not concave in $x$ for any given $\lambda$.

7.31 Show that the switching function defined in (7.38) is concave in $t$,
and hence show that the policy in Figure 7.14 is optimal.

7.32* Write the equation satisfied by the turnpike level $\bar{x}$ for the model

$$\max \left\{ J = \int_0^\infty e^{-\rho t}(\pi x - u^2)\,dt \right\}$$

subject to

$$\dot{x} = ru(1 - x) - \delta x, \quad x(0) = x_0.$$

Show that the turnpike reduces to the golden path when $\rho = 0$.

7.33* Extend the Nerlove-Arrow Model and its results by introducing
the additional capital stock variable

$$\dot{K} = v - \gamma K, \quad K(0) = K_0,$$

where $v$ is the research expenditure. Assume the new cost function
to be $C(S, K)$. Note that this model allows the firm to manipulate
its cost function. See Dhrymes (1962).

7.34* Analyze an extension of a finite horizon Nerlove-Arrow Model subject
to a budget constraint. That is, introduce the following isoperimetric
constraint:




Also assume $\pi(G) = a\sqrt{G}$ where $a > 0$ is a constant. See Sethi
(1977c).

7.35* Introduce a budget constraint in a different way into the Nerlove-Arrow
model as follows. Let $B(t)$ be the budget at time $t$, and let
$\gamma > 0$ be a constant. Assume $B$ satisfies

$$\dot{B} = e^{-\rho t}(-u + \gamma G), \quad B(0) = B_0$$

and $B(t) \ge 0$ for all $t$. Solve only the infinite horizon model. See
Sethi and Lee (1981).

7.36* Maximize the present value of total sales in the Nerlove-Arrow
model, i.e.,

$$\max \int_0^\infty e^{-\rho t} p S(p, G)\,dt$$

subject to (7.1) and the isoperimetric profit constraint

$$\int_0^\infty e^{-\rho t}[pS(p, G) - C(S) - u]\,dt = \pi.$$

See Tsurumi and Tsurumi (1971).

7.37* Consider (7.39) with the state equation replaced by

$$\dot{x} = ru(1 - x) + \mu x(1 - x) - \delta x, \quad x(0) = x_0,$$

where the constant $\mu > 0$ reflects word-of-mouth communication
between buyers (represented by $x$) and non-buyers (represented
by $(1 - x)$) of the product. Assume $Q$ is infinite for convenience.
Obtain the turnpike for this problem. See Sethi (1974b).

7.38* Solve (7.23) with the state equation replaced by

$$\dot{x} = r \log u - \delta x, \quad x(0) = x_0,$$

but without a terminal state constraint. Solve both the finite and
infinite horizon versions of the problem. See Sethi (1975).

7.39* The Ozga Model (Ozga, 1960; Gould, 1970): Suppose the information
spreads by word of mouth rather than by an impersonal
advertising medium, i.e., individuals who are already aware of the
product inform the individuals who are not, at a certain rate, influenced
by advertising expenditure. What we have now is the Ozga
model

$$\dot{x} = ux(1 - x) - \delta x, \quad x(0) = x_0.$$

The optimal control problem is to maximize

$$J = \int_0^\infty [\pi(x) - W(u)]\,dt$$

subject to the Ozga model. Assume that $\pi(x)$ is concave and $W(u)$
is convex. See Sethi (1979c) for a Green's theorem application to
this problem.

7.40 Obtain the optimal long-run stationary equilibrium for the following
modification of the model (7.23), due to Sethi (1983b):

$$\max \left\{ J = \int_0^\infty e^{-\rho t}(\pi x - u^2)\,dt \right\}$$

subject to

$$\dot{x} = ru\sqrt{1 - x} - \delta x, \quad x(0) = x_0 \in [0, 1], \qquad (7.47)$$

$$u \ge 0.$$

In particular, show that the turnpike triple $(\bar{x}, \bar{\lambda}, \bar{u})$ is given by

$$\bar{x} = \frac{r^2 \bar{\lambda}/2}{r^2 \bar{\lambda}/2 + \delta}, \qquad \bar{u} = \frac{r \bar{\lambda} \sqrt{1 - \bar{x}}}{2}, \qquad (7.48)$$

$$\bar{\lambda} = \frac{\sqrt{(\rho + \delta)^2 + r^2 \pi} - (\rho + \delta)}{r^2/2}. \qquad (7.49)$$

Show that the optimal value of the objective function is

$$J^*(x_0) = \bar{\lambda} x_0 + \frac{r^2 \bar{\lambda}^2}{4\rho}. \qquad (7.50)$$




Chapter 8 

The Maximum Principle: 
Discrete Time 



For many purposes it is convenient to assume that time is represented by
a discrete variable, $k = 0, 1, 2, \ldots, T$, rather than by a continuous variable
$t \in [0, T]$. This is particularly true when we wish to solve a large control
theory problem by means of a computer. It is also desirable, even when
solving small problems which have state or adjoint differential equations
whose solutions cannot be expressed in closed form, to formulate them as
discrete problems, and let the computer solve them in a stepwise manner.

We will see that the maximum principle, which is to be derived in this
chapter, is not valid for the discrete-time problem in as wide a sense as
for the continuous-time problem. In fact we will reduce it to a nonlinear
programming problem and state necessary conditions for its solution
by using the well-known Kuhn-Tucker theorem. In order to follow this
procedure, we have to make some simplifying assumptions and hence
will obtain only a restricted form of the discrete maximum principle. In
Section 8.2.5 we state without proof a more general form of the discrete
maximum principle.



8.1 Nonlinear Programming Problems 

We begin by stating a general form of a nonlinear programming problem.
Let $y$ be an $n$-component column vector, $a$ be an $r$-component column
vector, and $b$ an $s$-component column vector. Let $h : E^n \to E^1$,
$g : E^n \to E^r$, and $w : E^n \to E^s$ be given functions. We assume functions $g$
and $w$ to be column vectors with $r$ and $s$ components, respectively. We
consider the nonlinear programming problem:

$$\max h(y) \qquad (8.1)$$

subject to

$$g(y) = a, \qquad (8.2)$$

$$w(y) \ge b. \qquad (8.3)$$

We will develop necessary conditions, called the Kuhn-Tucker conditions,
which a solution $y^*$ to this problem must satisfy. We start with simpler
problems and work up to the statement of these conditions for the general
problem in a heuristic fashion. References are given for rigorous
developments of these results.

In this chapter, whenever we take derivatives of functions, we assume
that those derivatives exist and are continuous. It would also be helpful
to recall the notation developed in Section 1.4.

8.1.1 Lagrange Multipliers 

Suppose we want to solve (8.1) without imposing constraints (8.2) or
(8.3). The problem is now the classical unconstrained maximization
problem of calculus, and the first-order necessary conditions for its solution
are

$$h_y = 0. \qquad (8.4)$$

The points satisfying (8.4) are called critical points. Critical points which
are maxima, minima, or saddle points are of interest in this book. Additional
higher-order conditions required to determine whether a critical
point is a maximum or a minimum are stated in Exercise 8.2. In an
important case when the function $h$ is concave, condition (8.4) is also
sufficient for a global maximum of $h$.

Suppose we want to solve (8.1) while imposing just the equality constraints
(8.2). The method of Lagrange multipliers permits us to obtain
necessary conditions which a solution to the constrained maximization
problem (8.1) and (8.2) must satisfy. We define the Lagrangian function

$$L(y, \lambda) = h(y) + \lambda[g(y) - a], \qquad (8.5)$$

where $\lambda$ is an $r$-component row vector. The necessary condition for $y^*$
to be a (maximum) solution to (8.1) and (8.2) is that there exists an
$r$-component row vector $\lambda$ such that

$$L_y = h_y + \lambda g_y = 0, \qquad (8.6)$$

$$L_\lambda = g(y) - a = 0. \qquad (8.7)$$

Note that (8.7) is simply a repetition of the constraints in (8.2).

The system of $n + r$ equations (8.6) and (8.7) has $n + r$ unknowns.
Since some or all of the equations are nonlinear, the solution method
will, in general, involve nonlinear programming techniques, and may be
difficult. In other cases, e.g., when $h$ is quadratic and $g$ is linear, it may
only involve the solution of linear equations. Once a solution $(y^*, \lambda^*)$
is found satisfying the necessary conditions (8.6) and (8.7), the solution
must still be checked to see whether it satisfies sufficient conditions for
a global maximum. Such sufficient conditions will be stated in Section
8.1.3.

Suppose $(y^*, \lambda^*)$ is in fact a solution to the constrained problem (8.6)
and (8.7). Note that $y^*$ depends on $a$ and we can show this dependence
by writing $y^* = y^*(a)$. Now $h^* = h^*(a) = h(y^*(a))$ is the optimum value
of the objective function. The Lagrange multipliers satisfy the relation

$$h^*_a = -\lambda^*, \qquad (8.8)$$

which has an important managerial interpretation, namely, $\lambda^*$ is the
negative of the imputed value or shadow price of having
one unit more of the resource $a$. In Exercise 8.4 you are asked to provide
a proof of (8.8).

Example 8.1 Consider the problem:

$$\max \{h(x, y) = -x^2 - y^2\}$$

subject to

$$2x + y = 10.$$

Solution. We form the Lagrangian

$$L(x, y, \lambda) = (-x^2 - y^2) + \lambda(2x + y - 10).$$

The necessary conditions found by partial differentiation are

$$L_x = -2x + 2\lambda = 0,$$

$$L_y = -2y + \lambda = 0,$$

$$L_\lambda = 2x + y - 10 = 0.$$

From the first two equations we get

$$\lambda = x = 2y.$$

Solving this with the last equation yields the quantities

$$x^* = 4, \quad y^* = 2, \quad \lambda^* = 4, \quad h^* = -20,$$

which can be seen to give a maximum value to $h$, since $h$ is concave
and the constraint set is convex. The interpretation of the Lagrange
multiplier $\lambda^* = 4$ can be obtained to verify (8.8) by replacing the constant
10 by $10 + \varepsilon$ and expanding the objective function in a Taylor series; see
Exercise 8.5.
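The shadow-price relation (8.8) can also be checked numerically for this example. The Python sketch below solves the problem in closed form for a general right-hand side $a$ (the function name `solve_example` is a hypothetical helper) and compares a central finite-difference estimate of $dh^*/da$ at $a = 10$ with $-\lambda^*$.

```python
def solve_example(a):
    """Closed-form solution of max -(x**2 + y**2) subject to 2x + y = a.

    Stationarity of the Lagrangian gives lambda = x = 2y, and the
    constraint 2x + y = a then gives y = a/5.
    """
    y = a / 5.0
    x = 2.0 * y
    lam = x
    h = -(x**2 + y**2)
    return x, y, lam, h

x, y, lam, h = solve_example(10.0)

# Central finite difference of the optimal value with respect to a.
eps = 1e-6
dh_da = (solve_example(10.0 + eps)[3] - solve_example(10.0 - eps)[3]) / (2.0 * eps)
print(x, y, lam, h, dh_da)
```

The estimate of $dh^*/da$ comes out close to $-4$, in agreement with $h^*_a = -\lambda^*$.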

8.1.2 Inequality Constraints 

Now suppose we want to solve the inequality-constrained problem defined
by (8.1) and (8.3), without (8.2). The latter constraints will be appended
to the problem in Section 8.1.3.

As before we define the Lagrangian

$$L(y, \mu) = h(y) + \mu[w(y) - b]. \qquad (8.9)$$

The Kuhn-Tucker necessary conditions for this problem cannot be as
easily derived as for the equality-constrained problem in the preceding
section. We will write them first, and then give interpretations to make
them plausible. The necessary condition for $y^*$ to be a solution of (8.1)
and (8.3) is that there exists an $s$-dimensional row vector $\mu$ such that

$$L_y = h_y + \mu w_y = 0, \qquad (8.10)$$

$$w \ge b, \qquad (8.11)$$

$$\mu \ge 0, \quad \mu(w - b) = 0. \qquad (8.12)$$

Note that (8.10) is analogous to (8.6). Also (8.11) repeats the inequality
constraint (8.3) in the same way that (8.7) repeated the equality
constraint (8.2). However, the conditions in (8.12) are new and are
particular to the inequality-constrained problem. We will see that they
include some of the boundary points of the feasible set of points as well as
unconstrained maximum solution points, as candidates for the solution
to the maximum problem. This is best brought out by examples.

Example 8.2 Solve the problem:

$$\max \{h(x) = 8x - x^2\}$$

subject to

$$x \ge 2.$$

Solution. We form the Lagrangian

$$L(x, \mu) = 8x - x^2 + \mu(x - 2).$$

The necessary conditions (8.10)-(8.12) become

$$L_x = 8 - 2x + \mu = 0, \qquad (8.13)$$

$$x - 2 \ge 0, \qquad (8.14)$$

$$\mu \ge 0, \quad \mu(x - 2) = 0. \qquad (8.15)$$

Observe that the constraint $\mu(x - 2) = 0$ in (8.15) can be phrased as:
either $\mu = 0$, or $x = 2$. We treat these two cases separately.

Case 1: $\mu = 0$

From (8.13) we get $x = 4$, which also satisfies (8.14). Hence,
this solution, which makes $h(4) = 16$, is a possible candidate for the
maximum solution.

Case 2: $x = 2$

Here from (8.13) we get $\mu = -4$, which does not satisfy the inequality
$\mu \ge 0$ in (8.15).

From these two cases we conclude that the optimum solution is $x^* = 4$
and $h^* = h(x^*) = 16$.

Example 8.3 Solve the problem:

$$\max \{h(x) = 8x - x^2\}$$

subject to

$$x \ge 6.$$

Solution. The Lagrangian is

$$L(x, \mu) = 8x - x^2 + \mu(x - 6).$$

The necessary conditions are

$$L_x = 8 - 2x + \mu = 0, \qquad (8.16)$$

$$x - 6 \ge 0, \qquad (8.17)$$

$$\mu \ge 0, \quad \mu(x - 6) = 0. \qquad (8.18)$$

Again the condition $\mu(x - 6) = 0$ is an either-or relation which gives
two cases.

Case 1: $\mu = 0$

From (8.16) we obtain $x = 4$, which does not satisfy (8.17), so that
this case is infeasible.

Case 2: $x = 6$

Obviously (8.17) holds. From (8.16) we get $\mu = 4$, so that (8.18)
holds as well. The optimal solution is then

$$x^* = 6, \quad h^* = h(x^*) = 12,$$

since it is the only solution satisfying the necessary conditions.
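The either-or reasoning used in Examples 8.2 and 8.3 is mechanical enough to be written as a short routine. The Python sketch below enumerates the two cases for the problem $\max\{8x - x^2\}$ subject to $x \ge b$ and keeps only the candidates that pass the feasibility and sign checks. The function name `kkt_candidates` is a hypothetical helper introduced for illustration.

```python
def kkt_candidates(b):
    """Enumerate Kuhn-Tucker points of max 8x - x**2 subject to x >= b.

    Returns a list of (x, mu, h) triples that satisfy stationarity,
    feasibility, mu >= 0, and complementary slackness.
    """
    def h(x):
        return 8.0 * x - x * x

    candidates = []
    # Case 1: mu = 0, so 8 - 2x = 0 and x = 4; keep it only if x >= b.
    if 4.0 >= b:
        candidates.append((4.0, 0.0, h(4.0)))
    # Case 2: x = b, so 8 - 2b + mu = 0 gives mu = 2b - 8; keep it only
    # if the multiplier is nonnegative.
    mu = 2.0 * b - 8.0
    if mu >= 0.0:
        candidates.append((b, mu, h(b)))
    return candidates

print(kkt_candidates(2.0))  # Example 8.2
print(kkt_candidates(6.0))  # Example 8.3
```

With $b = 2$ only the interior point survives, and with $b = 6$ only the boundary point survives, matching the two examples.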

The examples above involve only one variable, and are relatively 
obvious. The next example, which is two-dimensional, will reveal more 
of the power and the difficulties of applying the Kuhn-Tucker conditions. 

Example 8.4 Find the shortest distance between the point (2,2) and
the upper half of the semicircle of radius one, whose center is at the
origin. In order to simplify the calculation, we minimize $h$, the square of
the distance. Hence, the problem can be stated as the following nonlinear
programming problem:

$$\max \{-h(x, y) = -(x - 2)^2 - (y - 2)^2\}$$

subject to

$$x^2 + y^2 \le 1,$$

$$y \ge 0.$$

The Lagrangian function for this problem is

$$L = -(x - 2)^2 - (y - 2)^2 + \mu(1 - x^2 - y^2) + \nu y. \qquad (8.19)$$

The necessary conditions are

$$-2(x - 2) - 2\mu x = 0, \qquad (8.20)$$

$$-2(y - 2) - 2\mu y + \nu = 0, \qquad (8.21)$$

$$1 - x^2 - y^2 \ge 0, \qquad (8.22)$$

$$y \ge 0, \qquad (8.23)$$

$$\mu \ge 0, \quad \mu(1 - x^2 - y^2) = 0, \qquad (8.24)$$

$$\nu \ge 0, \quad \nu y = 0. \qquad (8.25)$$

From (8.24) we see that either $\mu = 0$ or $x^2 + y^2 = 1$, i.e., we are on the
boundary of the semicircle. If $\mu = 0$, we see from (8.20) that $x = 2$. But
$x = 2$ does not satisfy (8.22) for any $y$, and hence we conclude $\mu > 0$
and $x^2 + y^2 = 1$.

From (8.25) we conclude either $\nu = 0$ or $y = 0$. If $\nu = 0$, then
from (8.20), (8.21), and $\mu > 0$, we get $x = y$. Solving the latter with
$x^2 + y^2 = 1$ gives

(a) $(\sqrt{2}/2, \sqrt{2}/2)$ and $h = 9 - 4\sqrt{2}$.

If $y = 0$, then solving with $x^2 + y^2 = 1$ gives

(b) $(1, 0)$ and $h = 5$,

(c) $(-1, 0)$ and $h = 13$.

These three points are shown in Figure 8.1. Of the three candidate points
found, clearly the point $(\sqrt{2}/2, \sqrt{2}/2)$
found in (a) is the nearest point and solves the closest-point problem.
The point $(-1, 0)$ in (c) is in fact the farthest point; and the point $(1, 0)$
in (b) is neither the closest nor the farthest point.

Note that the case analysis above used only the equalities in (8.20),
(8.21), (8.24), and (8.25). Substituting the candidates back gives
$\mu = 2\sqrt{2} - 1$ and $\nu = 0$ at (a), but $\nu = -4$ at (b) and $\mu = -3$ at (c), so
only (a) also meets the sign requirements $\mu \ge 0$ and $\nu \ge 0$. The fact that
three candidates emerge, and only one of them actually solves the problem
at hand, emphasizes that every solution obtained from the conditions must
be checked; in general the conditions are only necessary and not sufficient. In every
case it is important to check the solutions to the necessary conditions to
see which of the solutions provides the optimum.
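The check of the candidates in Example 8.4 can be carried out numerically. The Python sketch below evaluates the squared distance at the three points and recovers the multipliers implied by (8.20) and (8.21), so that the sign requirements in (8.24) and (8.25) can be inspected for each candidate. The helper names `dist2` and `multipliers` are hypothetical.

```python
import math

def dist2(x, y):
    """Squared distance from (x, y) to the point (2, 2)."""
    return (x - 2.0) ** 2 + (y - 2.0) ** 2

def multipliers(x, y):
    """Solve (8.20) for mu and then (8.21) for nu; requires x != 0."""
    mu = -(x - 2.0) / x                   # from -2(x - 2) - 2*mu*x = 0
    nu = 2.0 * (y - 2.0) + 2.0 * mu * y   # from -2(y - 2) - 2*mu*y + nu = 0
    return mu, nu

s = math.sqrt(2.0) / 2.0
points = {"a": (s, s), "b": (1.0, 0.0), "c": (-1.0, 0.0)}

# Each entry is (squared distance, mu, nu).
report = {name: (dist2(*p), *multipliers(*p)) for name, p in points.items()}
for name, values in report.items():
    print(name, values)
```

In this sketch only candidate (a) returns nonnegative values for both multipliers; the computed $\nu$ is $-4$ at (b), and the computed $\mu$ is $-3$ at (c).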

Next we work two examples that show some technical difficulties that
can arise in the application of the Kuhn-Tucker conditions.

Figure 8.1: Shortest Distance from a Point to a Semi-Circle



Example 8.5 Consider the problem:

$$\max \{h(x, y) = y\} \qquad (8.26)$$

subject to

$$-x^{2/3} - y + 1 \ge 0, \qquad (8.27)$$

$$y \ge 0. \qquad (8.28)$$

The set of points satisfying the constraints is shown shaded in Figure
8.2. From the figure it is obvious that the solution point (0,1) maximizes
the value of $y$. Let us see if we can find it using the above procedure.

Figure 8.2: Graph of Example 8.5

The Lagrangian is

$$L = y + \lambda(-x^{2/3} - y + 1) + \mu y. \qquad (8.29)$$

The necessary conditions are

$$L_x = -\tfrac{2}{3}\lambda x^{-1/3} = 0, \qquad (8.30)$$

$$L_y = 1 - \lambda + \mu = 0, \qquad (8.31)$$

$$\lambda \ge 0, \quad \lambda(-x^{2/3} - y + 1) = 0, \qquad (8.32)$$

$$\mu \ge 0, \quad \mu y = 0, \qquad (8.33)$$

together with (8.27) and (8.28). From (8.30) we get $\lambda = 0$, since $x^{-1/3}$ is
never 0 in the range $-1 \le x \le 1$. But substitution of $\lambda = 0$ into (8.31)
gives $\mu = -1 < 0$, which fails to solve (8.33).

You may think perhaps that the failure of the method
is due to the non-differentiability of constraint (8.27) when $x = 0$. That
is part of the reason, but the next example shows that something deeper
is involved.



Example 8.6 Consider the problem:

$$\max \{h(x, y) = y\} \qquad (8.34)$$

subject to

$$(1 - y)^3 - x^2 \ge 0, \qquad (8.35)$$

$$y \ge 0. \qquad (8.36)$$

The constraints are now differentiable and the set of feasible solutions is
exactly the same as for Example 8.5, and is shown in Figure 8.2. Hence,
the optimum solution is $(x^*, y^*) = (0, 1)$ and $h^* = 1$. But once again the
Kuhn-Tucker method fails, as we will see. The Lagrangian is

$$L = y + \lambda[(1 - y)^3 - x^2] + \mu y, \qquad (8.37)$$

so that the necessary conditions are

$$L_x = -2x\lambda = 0, \qquad (8.38)$$

$$L_y = 1 - 3\lambda(1 - y)^2 + \mu = 0, \qquad (8.39)$$

$$\lambda \ge 0, \quad \lambda[(1 - y)^3 - x^2] = 0, \qquad (8.40)$$

$$\mu \ge 0, \quad \mu y = 0, \qquad (8.41)$$

together with (8.35) and (8.36). From (8.41) we get, either $y = 0$ or
$\mu = 0$. Since $y = 0$ minimizes the objective function, we choose $\mu = 0$.
From (8.38) we get either $\lambda = 0$ or $x = 0$. Since substitution of $\lambda = \mu = 0$
into (8.39) shows that it is unsolvable, we choose $x = 0$, $\lambda \ne 0$. But then
(8.40) gives $(1 - y)^3 = 0$ or $y = 1$. However, $\mu = 0$ and $y = 1$ means that
once more there is no solution to (8.39).

The reason for failure of the method in Example 8.6 is that the constraints
do not satisfy what is called the constraint qualification, which
will be discussed in the next section.
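One way to see the failure concretely is to evaluate the gradients at the optimum. The Python sketch below, a minimal illustration with hypothetical helper names, shows that the gradient of the active constraint $(1 - y)^3 - x^2$ vanishes at $(0, 1)$, so the stationarity condition cannot be met there for any value of $\lambda$.

```python
def grad_h(x, y):
    """Gradient of the objective h(x, y) = y."""
    return (0.0, 1.0)

def grad_w1(x, y):
    """Gradient of the constraint function w1 = (1 - y)**3 - x**2."""
    return (-2.0 * x, -3.0 * (1.0 - y) ** 2)

def stationarity_residual(x, y, lam, mu):
    """Components of grad h + lam * grad w1 + mu * grad w2, where w2 = y."""
    hx, hy = grad_h(x, y)
    wx, wy = grad_w1(x, y)
    return (hx + lam * wx, hy + lam * wy + mu)

# At the optimum (0, 1) the constraint y >= 0 is slack, so mu = 0.
residuals = [stationarity_residual(0.0, 1.0, lam, 0.0) for lam in (0.0, 1.0, 1e6)]
print(grad_w1(0.0, 1.0), residuals)
```

However large $\lambda$ is chosen, the second component of the residual stays equal to 1, which mirrors the unsolvability of (8.39) at $y = 1$ with $\mu = 0$.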

Example 8.6 shows the necessity of imposing some kind of condition
to rule out cusp points such as the (0, 1) point in Figure 8.2, since the
method shown will fail to find the solution when the answer occurs at
such points. A brief mathematical description of cusp points is given
shortly hereafter in this section. A complete study of the problem is
beyond the scope of this book, but we state here a version of the constraint
qualification sufficient for our purposes. For further information,
see Mangasarian (1969).

In order to motivate the definition we illustrate two different situations
in Figure 8.3. In Figure 8.3(a) we show two boundary curves
$w_1(y) = b_1$ and $w_2(y) = b_2$ intersecting at the boundary point $\bar{y}$. The two
tangents to these curves are shown, and $v$ is a vector lying between the
two tangents. Starting at $\bar{y}$, there is a differentiable curve $c(\tau)$, $0 \le \tau \le 1$,
drawn so that it lies entirely within the feasible set $Y$, such that its initial
slope is equal to $v$. Whenever such a curve can be drawn from every
boundary point $\bar{y}$ in $Y$ and every $v$ contained between the tangent lines,
we say that the constraints defining $Y$ satisfy the Kuhn-Tucker constraint
qualification.

Figure 8.3: Kuhn-Tucker Constraint Qualification. (a) Constraint Qualification Satisfied. (b) Constraint Qualification Not Satisfied.

Figure 8.3(b) illustrates a case of a cusp at which the constraint
qualification does not hold. Here the two tangents to the graphs $w_1(y) = b_1$
and $w_2(y) = b_2$ coincide, so that $v_1$ and $v_2$ are vectors lying between
these two tangents. Notice that for vector $v_1$ it is possible to find the
differentiable curve $c(\tau)$ satisfying the above conditions, but for vector
$v_2$ no such curve exists. Hence, the constraint qualification does not hold
for the example in Figure 8.3(b).



8.1.3 Theorems from Nonlinear Programming 

We now state without proof two nonlinear programming theorems which 
we will use in deriving our version of the discrete maximum principle. 
For proofs, see Mangasarian (1969). 

We first state the constraint qualification symbolically. For the problem
defined by (8.1), (8.2), and (8.3), let $Y$ be the set of all feasible
vectors satisfying (8.2) and (8.3), i.e.,

$$Y = \{y \mid g(y) = a, \; w(y) \ge b\}.$$

Let $\bar{y}$ be any point of $Y$ and let $z(\bar{y}) = d$ be the vector of tight constraints
at point $\bar{y}$, i.e., $z$ includes all the $g$ constraints in (8.2) and those
constraints in (8.3) which are satisfied as equalities.

Define the set

$$V = \left\{ v \;\middle|\; \frac{\partial z(\bar{y})}{\partial y} v \le 0 \right\}. \qquad (8.42)$$

Then, we shall say that the constraint set $Y$ satisfies the Kuhn-Tucker
constraint qualification at $\bar{y} \in Y$ if $z$ is differentiable at $\bar{y}$ and if, for every
$v \in V$, there exists a differentiable curve $c(\tau)$ defined for $0 \le \tau \le 1$ such
that

(i) $c(0) = \bar{y}$,

(ii) $c(\tau) \in Y$ for all $\tau$ satisfying $0 \le \tau \le 1$,

(iii) $\dfrac{dc(0)}{d\tau} = kv$ for some constant $k > 0$.

The interpretation of this condition was given for the two dimensional
case in the preceding section.
We now state two theorems giving necessary and sufficient conditions
for the problem given by (8.1)-(8.3). The Lagrangian function for
this problem is

$$L(y, \lambda, \mu) = h(y) + \lambda(g(y) - a) + \mu(w(y) - b). \qquad (8.43)$$

The Kuhn-Tucker conditions at $\bar{y} \in Y$ for this problem are

$$L_y(\bar{y}, \bar{\lambda}, \bar{\mu}) = h_y(\bar{y}) + \bar{\lambda} g_y(\bar{y}) + \bar{\mu} w_y(\bar{y}) = 0, \qquad (8.44)$$

$$g(\bar{y}) = a, \qquad (8.45)$$

$$w(\bar{y}) \ge b, \qquad (8.46)$$

$$\bar{\mu} \ge 0, \quad \bar{\mu}(w(\bar{y}) - b) = 0, \qquad (8.47)$$

where $\bar{\lambda}$ and $\bar{\mu}$ are row vectors of multipliers to be determined.

Theorem 8.1 (Sufficient Conditions). If $h$, $g$, and $w$ are differentiable,
$h$ is concave, $g$ is affine, $w$ is concave, and $(\bar{y}, \bar{\lambda}, \bar{\mu})$ solve the conditions
(8.44)-(8.47), then $\bar{y}$ is a solution to the maximization problem (8.1)-(8.3).

Theorem 8.2 (Necessary Conditions). If $h$, $g$, and $w$ are differentiable,
$\bar{y}$ solves the maximization problem, and the constraint qualification holds
at $\bar{y}$, then there exist multipliers $\bar{\lambda}$ and $\bar{\mu}$ such that $(\bar{y}, \bar{\lambda}, \bar{\mu})$ satisfy
conditions (8.44)-(8.47).

8.2 A Discrete Maximum Principle 

We shall now use the nonlinear programming results of the previous 
section to derive a special form of the discrete maximum principle. Two 
references in this connection are Luenberger (1972) and Mangasarian 
and Fromovitz (1967). A more general discrete maximum principle will 
be stated in Section 8.3. 

8.2.1 A Discrete-Time Optimal Control Problem 

In order to state a discrete-time optimal control problem over the periods
$0, 1, 2, \ldots, T$, we define the following:

$\Theta$ = the set $\{0, 1, 2, \ldots, T - 1\}$,

$x^k$ = an $n$-component column state vector; $k = 0, 1, \ldots, T$,

$u^k$ = an $m$-component column control vector; $k = 0, 1, 2, \ldots, T - 1$,

$b^k$ = an $s$-component column vector of constants; $k = 0, 1, \ldots, T - 1$.

Here, the state $x^k$ is assumed to be measured at the beginning of
period $k$ and control $u^k$ is implemented during period $k$. This convention
is depicted in Figure 8.4.

Figure 8.4: Discrete-Time Conventions

We also define continuously differentiable functions
$f : E^n \times E^m \times \Theta \to E^n$, $F : E^n \times E^m \times \Theta \to E^1$,
$g : E^m \times \Theta \to E^s$, and $S : E^n \times (\Theta \cup \{T\}) \to E^1$.

Then, a discrete-time optimal control problem in the Bolza form (see
Section 2.1.4) is:

$$\max \left\{ J = \sum_{k=0}^{T-1} F(x^k, u^k, k) + S(x^T, T) \right\} \qquad (8.48)$$

subject to the difference equation

$$\Delta x^k = x^{k+1} - x^k = f(x^k, u^k, k), \quad k = 0, \ldots, T - 1, \quad x^0 \text{ given}, \qquad (8.49)$$

$$g(u^k, k) \ge b^k, \quad k = 0, \ldots, T - 1. \qquad (8.50)$$

In (8.49) the term $\Delta x^k = x^{k+1} - x^k$ is known as the difference operator.

8.2.2 A Discrete Maximum Principle 

We now apply the nonlinear programming theory of Section 8.1 to find
necessary conditions for the solution to the Mayer form of the control
problem of Section 8.2.1.

We let $\lambda^{k+1}$ be an $n$-component row vector of Lagrange multipliers,
which we rename adjoint variables and which we associate with equation
(8.49). Similarly, we let $\mu^k$ be an $s$-component row vector of Lagrange
multipliers associated with constraint (8.50). These multipliers are defined
for each time $k = 0, \ldots, T - 1$.

The Lagrangian function of the problem is

$$L = \sum_{k=0}^{T-1} F(x^k, u^k, k) + S(x^T, T) + \sum_{k=0}^{T-1} \lambda^{k+1}[f(x^k, u^k, k) - x^{k+1} + x^k] + \sum_{k=0}^{T-1} \mu^k[g(u^k, k) - b^k]. \qquad (8.51)$$

We now define the Hamiltonian function $H^k$ to be

$$H^k = H(x^k, u^k, k) = F(x^k, u^k, k) + \lambda^{k+1} f(x^k, u^k, k). \qquad (8.52)$$

Using (8.52) we can rewrite (8.51) as

$$L = S(x^T, T) + \sum_{k=0}^{T-1} [H^k - \lambda^{k+1}(x^{k+1} - x^k)] + \sum_{k=0}^{T-1} \mu^k[g(u^k, k) - b^k]. \qquad (8.53)$$

If we differentiate (8.53) with respect to $x^k$ for $k = 1, 2, \ldots, T - 1$, we
obtain

$$\frac{\partial L}{\partial x^k} = \frac{\partial H^k}{\partial x^k} - \lambda^k + \lambda^{k+1} = 0,$$

which upon rearranging terms becomes

$$\Delta\lambda^k = \lambda^{k+1} - \lambda^k = -\frac{\partial H^k}{\partial x^k}, \quad k = 0, 1, \ldots, T - 1. \qquad (8.54)$$

If we differentiate (8.53) with respect to $x^T$, we get

$$\frac{\partial L}{\partial x^T} = \frac{\partial S}{\partial x^T} - \lambda^T = 0, \quad \text{or} \quad \lambda^T = \frac{\partial S}{\partial x^T}. \qquad (8.55)$$

The difference equations (8.54) with terminal boundary conditions (8.55)
are called the adjoint equations.

If we differentiate $L$ with respect to $u^k$ and state the corresponding
Kuhn-Tucker conditions for the multiplier $\mu^k$ and constraint (8.50), we
have

$$\frac{\partial L}{\partial u^k} = \frac{\partial H^k}{\partial u^k} + \mu^k \frac{\partial g}{\partial u^k} = 0, \qquad (8.56)$$

and

$$\mu^k \ge 0, \quad \mu^k[g(u^k, k) - b^k] = 0. \qquad (8.57)$$

We note that, provided $H^k$ is concave in $u^k$, $g(u^k, k)$ is concave in $u^k$, and
the constraint qualification holds, then conditions (8.56) and (8.57) are
precisely the necessary and sufficient conditions for solving the following
Hamiltonian maximization problem:

$$\max_{u^k} H^k \quad \text{subject to} \quad g(u^k, k) \ge b^k. \qquad (8.58)$$

We have thus derived the following restricted form of the discrete maximum
principle.

Theorem 8.3. If for every $k$, $H^k$ in (8.52) and $g(u^k, k)$ are concave in
$u^k$, and the constraint qualification holds, then the necessary conditions
for $u^{k*}$, $k = 0, 1, \ldots, T - 1$, to be an optimal control for the problem (8.48)-(8.50)
are

$$\begin{cases}
\Delta x^{k*} = f(x^{k*}, u^{k*}, k), \quad x^0 \text{ given}, \\
\Delta\lambda^k = -\dfrac{\partial H^k}{\partial x^k}[x^{k*}, u^{k*}, \lambda^{k+1}, k], \quad \lambda^T = \dfrac{\partial S(x^{T*}, T)}{\partial x^T}, \\
H^k(x^{k*}, u^{k*}, \lambda^{k+1}, k) \ge H^k(x^{k*}, u^k, \lambda^{k+1}, k) \\
\quad \text{for all } u^k \text{ such that } g(u^k, k) \ge b^k, \quad k = 0, 1, \ldots, T - 1.
\end{cases} \qquad (8.59)$$

Section 8.2.3 gives examples of the application of this maximum principle
(8.59). In Section 8.3 we state a more general discrete maximum
principle.

8.2.3 Examples 

Our first example will be similar to Example 2.3 and it will be solved
completely. The reader will note that the solutions of the continuous
and discrete problems are very similar. The second example is a discrete
version of the production-inventory problem of Section 6.1.

Example 8.7 Consider the discrete-time optimal control problem:

$$\max\left\{J = \sum_{k=0}^{T-1} -\frac{1}{2}(x^k)^2\right\} \tag{8.60}$$

subject to

$$\Delta x^k = u^k, \quad x^0 = 5, \tag{8.61}$$

$$u^k \in \Omega = [-1, 1]. \tag{8.62}$$

We shall solve this problem for $T = 6$ and $T \ge 7$.



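Solution. The Hamiltonian is

$$H^k = -\frac{1}{2}(x^k)^2 + \lambda^{k+1}u^k, \tag{8.63}$$

from which it is obvious that the optimal policy is bang-bang. Its form
is

$$u^k = \text{bang}[-1, 1; \lambda^{k+1}] = \begin{cases}
1 & \text{if } \lambda^{k+1} > 0, \\
\text{singular} & \text{if } \lambda^{k+1} = 0, \\
-1 & \text{if } \lambda^{k+1} < 0.
\end{cases} \tag{8.64}$$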



Let us assume, as we did in Example 2.3, that $\lambda^k < 0$ as long as $x^k$
is positive so that $u^k = -1$. Given this assumption, (8.61) becomes
$\Delta x^k = -1$, whose solution is

$$x^k = -k + 5 \quad \text{for } k = 1, 2, \ldots, T-1. \tag{8.65}$$

By differentiating (8.63), we obtain the adjoint equation

$$\Delta\lambda^k = -\frac{\partial H^k}{\partial x^k} = x^k, \quad \lambda^T = 0. \tag{8.66}$$

Let us assume $T = 6$. Substitute (8.65) into (8.66) to obtain

$$\Delta\lambda^k = -k + 5, \quad \lambda^6 = 0.$$

From Appendix A.11, we find the solution to be

$$\lambda^k = -\frac{1}{2}k^2 + \frac{11}{2}k + c,$$

where $c$ is a constant. Since $\lambda^6 = 0$, we can obtain the value of $c$ by
setting $k = 6$ in the above equation. Thus,

$$\lambda^6 = -\frac{1}{2}(36) + \frac{11}{2}(6) + c = 0 \ \Rightarrow\ c = -15,$$

so that

$$\lambda^k = -\frac{1}{2}k^2 + \frac{11}{2}k - 15. \tag{8.67}$$

A sketch of the values for $\lambda^k$ and $x^k$ appears in Figure 8.5. Note that
$\lambda^5 = 0$, so that the control $u^4$ is singular. However, since $x^4 = 1$ we
choose $u^4 = -1$ in order to bring $x^5$ down to 0.




Figure 8.5: Sketch of $x^k$ and $\lambda^k$

The solution of the problem for $T \ge 7$ is carried out in the same
way that we solved Example 2.3. Namely, observe that $x^5 = 0$ and
$\lambda^5 = \lambda^6 = 0$, so that the control is singular. We simply make $\lambda^k = 0$ for
$k \ge 7$ so that $u^k = 0$ for all $k \ge 7$. It is clear without a formal proof that
this maximizes (8.60).
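
The calculations of this example can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes the stage payoff $-\frac{1}{2}(x^k)^2$, the dynamics $\Delta x^k = u^k$ with $x^0 = 5$, and the adjoint recursion $\lambda^{k+1} - \lambda^k = x^k$ with $\lambda^T = 0$. The function names `simulate_example` and `closed_form_lambda` are hypothetical helpers introduced only for illustration.

```python
def simulate_example(T=6, x0=5.0):
    """Forward state pass and backward adjoint pass for the T-period problem."""
    # Candidate policy: push the state down at the maximum rate until it hits zero.
    x = [x0]
    u = []
    for k in range(T):
        uk = -1.0 if x[k] > 0 else 0.0
        u.append(uk)
        x.append(x[k] + uk)              # state equation: x^{k+1} - x^k = u^k

    # Adjoint recursion run backward from the terminal condition lambda^T = 0.
    lam = [0.0] * (T + 1)
    for k in range(T - 1, -1, -1):
        lam[k] = lam[k + 1] - x[k]       # lambda^{k+1} - lambda^k = x^k
    return x, u, lam


def closed_form_lambda(k):
    """Closed-form adjoint for T = 6: -k^2/2 + 11k/2 - 15."""
    return -0.5 * k**2 + 5.5 * k - 15.0


x, u, lam = simulate_example()
for k in range(7):
    # Bang-bang check: a strictly negative lambda^{k+1} must go with u^k = -1.
    if k < 6 and lam[k + 1] < 0:
        assert u[k] == -1.0
    print(k, x[k], lam[k], closed_form_lambda(k))
```

The recursion and the closed form should agree at every $k$, and the control is at its lower bound whenever the next-period adjoint is strictly negative; the periods with a zero adjoint are the singular ones discussed above.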
Example 8.8 Let us consider a discrete version of the production-
inventory example of Section 6.1; see Kleindorfer, Kriebel, Thompson,
and Kleindorfer (1975). Let $I^k$, $P^k$, and $S^k$ be the inventory, production,
and demand at time $k$, respectively. Let $I^0$ be the initial inventory, let
$\hat{I}$ and $\hat{P}$ be the goal levels of inventory and production, and let $h$ and $c$
be inventory and production cost coefficients. The problem is:

$$\max_{P^k \ge 0}\left\{J = \sum_{k=0}^{T-1} -\frac{1}{2}\left[h(I^k - \hat{I})^2 + c(P^k - \hat{P})^2\right]\right\} \tag{8.68}$$

subject to

$$\Delta I^k = P^k - S^k, \quad I^0 \text{ given}. \tag{8.69}$$

Form the Hamiltonian

$$H^k = -\frac{1}{2}\left[h(I^k - \hat{I})^2 + c(P^k - \hat{P})^2\right] + \lambda^{k+1}(P^k - S^k), \tag{8.70}$$

where the adjoint variable satisfies

$$\Delta\lambda^k = -\frac{\partial H^k}{\partial I^k} = h(I^k - \hat{I}), \quad \lambda^T = 0. \tag{8.71}$$

To maximize the Hamiltonian, let us differentiate (8.70) to obtain

$$\frac{\partial H^k}{\partial P^k} = -c(P^k - \hat{P}) + \lambda^{k+1} = 0.$$

Since production must be nonnegative, we obtain the optimal production
as

$$P^{k*} = \max\left[0, \hat{P} + \lambda^{k+1}/c\right]. \tag{8.72}$$

Expressions (8.69), (8.71), and (8.72) determine a two-point boundary
value problem. For a given set of data, it can be solved numerically
by using spreadsheet software such as Excel; see Section 2.5. If the
constraint $P^k \ge 0$ is dropped, it can be solved analytically by the method of
Section 6.1, with difference equations replacing the differential equations
used there.
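
As an alternative to a spreadsheet, the two-point boundary value problem can be attacked by a simple shooting method. The sketch below is only one possible approach and is not taken from the text: it guesses $\lambda^0$, runs the inventory equation, the adjoint equation, and the production rule forward in time, and then adjusts the guess by bisection until $\lambda^T = 0$. The demand data, goal levels, and cost coefficients are hypothetical, and `shoot` and `solve_tpbvp` are illustrative names.

```python
def shoot(lam0, I0, S, I_hat, P_hat, h, c):
    """Run the state, adjoint, and production rule forward from a guessed lambda^0."""
    I, lam, P = [I0], [lam0], []
    for k in range(len(S)):
        lam_next = lam[k] + h * (I[k] - I_hat)    # adjoint equation
        Pk = max(0.0, P_hat + lam_next / c)        # nonnegative production rule
        I.append(I[k] + Pk - S[k])                 # inventory balance
        lam.append(lam_next)
        P.append(Pk)
    return I, P, lam


def solve_tpbvp(I0, S, I_hat, P_hat, h, c, iters=200):
    """Bisection on lambda^0 so that the terminal condition lambda^T = 0 holds."""
    def terminal(lam0):
        return shoot(lam0, I0, S, I_hat, P_hat, h, c)[2][-1]

    # lambda^T is increasing in lambda^0, so a sign change can always be bracketed.
    lo, hi = -1.0, 1.0
    while terminal(lo) > 0:
        lo *= 2.0
    while terminal(hi) < 0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if terminal(mid) < 0:
            lo = mid
        else:
            hi = mid
    return shoot(0.5 * (lo + hi), I0, S, I_hat, P_hat, h, c)


# Hypothetical data, chosen only for illustration.
S = [20.0, 25.0, 30.0, 25.0, 20.0, 15.0]
I, P, lam = solve_tpbvp(I0=10.0, S=S, I_hat=15.0, P_hat=22.0, h=1.0, c=2.0)
for k in range(len(S)):
    print(k, round(I[k], 3), round(P[k], 3), round(lam[k + 1], 3))
```

Bisection is used here because the terminal adjoint value is a monotone function of the initial guess; for longer horizons the forward pass becomes sensitive to $\lambda^0$, and a more careful method may be needed.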



8.3 A General Discrete Maximum Principle 

For the maximum principle (8.59) we assumed that $H^k$ and $g$ were
concave in $u^k$ so that the set of admissible controls was convex. These
are fairly strong assumptions which will now be relaxed and a general
maximum principle stated. The proof can be found in Canon, Cullum
and Polak (1970). Other references on discrete maximum principles are
Halkin (1966) and Holtzman (1966). The problem to be solved is:

$$\max_{u^k \in \Omega^k}\left\{J = \sum_{k=0}^{T-1} F(x^k,u^k,k)\right\} \tag{8.73}$$

subject to

$$\Delta x^k = f(x^k,u^k,k), \quad x^0 \text{ given}, \quad k = 0, 1, \ldots, (T-1). \tag{8.74}$$

Assumptions required are:

(i) $F(x^k,u^k,k)$ and $f(x^k,u^k,k)$ are continuously differentiable in $x^k$
for every $u^k$ and $k$.

(ii) The sets $\{-F(x,\Omega^k,k), f(x,\Omega^k,k)\}$ are $b$-directionally convex for
every $x$ and $k$, where $b = (-1, 0, \ldots, 0)$. That is, given $v$ and $w$ in $\Omega^k$ and
$0 \le \lambda \le 1$, there exists $u(\lambda) \in \Omega^k$ such that

$$F(x,u(\lambda),k) \ge \lambda F(x,v,k) + (1-\lambda)F(x,w,k)$$

and

$$f(x,u(\lambda),k) = \lambda f(x,v,k) + (1-\lambda)f(x,w,k)$$

for every $x$ and $k$. It should be noted that convexity implies $b$-directional
convexity, but not the converse.

(iii) $\Omega^k$ satisfies the Kuhn-Tucker constraint qualification.

With these assumptions replacing the assumptions of Theorem 8.3,
the maximum principle (8.59), with $S(x^T,T) \equiv 0$, holds with control
constraint set $g(u^k,k) \ge b^k$ replaced by $u^k \in \Omega^k$. When $S(x^T,T)$ is not a
zero function, the objective function in (8.73) is replaced by the Bolza
form objective function (8.48). In Exercise 8.17, you are asked to convert
the problem defined by (8.48) and (8.74) to its Lagrange form, and
then obtain the corresponding assumptions on the salvage value function
$S(x^T,T)$ for the results of this section to apply. For a fixed-end-point
problem, i.e., when $x^T$ is also given in (8.74), the more general maximum
principle holds with $\lambda^T$ a constant to be determined. Exercise 8.14 is
an example of a fixed-end-point problem. Finally, when there are lags
in the system dynamics, i.e., when the state of the system in a period
depends not only on the state and the control in the previous period, but
also on the values of these variables in prior periods, it is easy to adapt
the discrete maximum principle to deal with such systems; see Burdet
and Sethi (1976). Exercise 8.18 presents an advertising model containing
lags in its sales-advertising dynamics.

Some concluding remarks on the applications of discrete-time optimal
control problems are appropriate. Examples of real-life problems
that can be modeled as such problems include the following: payments 
of principal and interest on loans; harvesting of crops; production plan- 
ning for monthly demands; etc. Such problems would require efficient 
computational procedures for their solution. Some references dealing 
with computational methods for discrete optimal control problems are 
Murray and Yakowitz (1984), Dunn and Bertsekas (1989), Pantoja and 
Mayne (1991), Wright (1993), and Dohrmann and Robinett (1999). An- 
other reason which makes the discrete optimal control theory important 
arises from the fact that digital computers are being used increasingly 
in the control of dynamic systems. 

Finally, Pepyne and Cassandras (1999) have recently explored an op- 
timal control approach to treat discrete event dynamic systems (DEDS). 
They also apply the approach to a transportation problem, modeled as 
a polling system. 

EXERCISES FOR CHAPTER 8 

8.1 Determine the critical points of the following functions: 

(a) $h(y,z) = -5y^2 - z^2 + 10y + 6z + 27$

(b) $h(y,z) = 5y^2 - yz + z^2 - 10y - 18z + 17$.

8.2 Let $h$ be twice differentiable with its Hessian matrix defined to be
$H = h_{yy}$. Let $\hat{y}$ be a critical point, i.e., a solution of $h_y = 0$.
Let $H_j$ be the $j$th principal minor, i.e., the $j \times j$ submatrix found
in the first $j$ rows and the first $j$ columns of $H$. Let $|H_j|$ be the
determinant of $H_j$. Then, $\hat{y}$ is a local maximum of $h$ if

$$H_1 < 0, \ |H_2| > 0, \ |H_3| < 0, \ \ldots, \ (-1)^n|H_n| = (-1)^n|H| > 0$$

evaluated at $\hat{y}$; and $\hat{y}$ is a local minimum of $h$ if

$$H_1 > 0, \ |H_2| > 0, \ |H_3| > 0, \ \ldots, \ |H_n| = |H| > 0$$

evaluated at $\hat{y}$. Apply these conditions to Exercise 8.1 to identify
local minima and maxima of the functions in (a) and (b).
8.3 (a) During times of an energy crisis, it is important to economize
on fuel consumption. Assume that when traveling $x$ miles/hour in
high gear, a truck burns fuel at the rate of

$$\frac{1}{500}\left[\frac{2500}{x} + x\right] \text{ gallons/mile.}$$



If fuel costs 50 cents per gallon, find the speed that will minimize 
the cost of fuel for a 1000 mile trip. Check the second-order con- 
ditions. 

(b) When the government imposed this optimal speed in 1974, 
truck drivers became so angry they staged blockades on several 
freeways around the country. To explain the reason for these block- 
ades, we find that a crucial figure is the hourly wage rate of the 
truckers, a realistic estimate (at that time) being $3.90 an hour.
Recompute a speed that will minimize the total cost of the same 
trip, this time inclusive of the driver’s wages. No check needs to 
be made for the second-order condition. 



8.4 Use (8.5)-(8.7) to derive equation (8.8). 

8.5 Verify equation (8.8) in Example 8.1 by determining $h^*(a)$ and
expanding the function $h^*(10 + \epsilon)$ in a Taylor series around the
value 10.

8.6 Maximize $h(x) = (1/3)x^3 - 6x^2 + 32x + 5$ subject to each of the
following constraints:

(a) $x \le 6$

(b) $x \le 20$.

8.7 Rework Example 8.4 with each of the following points: 

(a) (0,-1) 

(b) (1/2, 1/2). 

8.8 Add the equality constraint 2x = y to the problem in Example 8.4 
and solve it. 
8.9 Solve the problem:

$\max h(x,y)$
subject to

$y \ge 0$,

for

(a) $h(x,y) = x + y$.

(b) $h(x,y) = x + 2y$.

(c) $h(x,y) = x + 3y$.

Comment on the solution in each of the cases (a), (b), and (c).

8.10 Rewrite the maximum principle (8.59) for the special case of the
linear Mayer form problem obtained when $F = 0$ and $S(x^T,T) =
cx^T$, where $c$ is an $n$-component row vector of constants.

8.11 Show that the necessary conditions for $u^k$ to be an optimal solution
for (8.58) are given by (8.56) and (8.57).

8.12 Prove Theorem 8.3. 

8.13 Formulate and solve a discrete- time version of the cash balance 
model of Section 5.1.1. 

8.14 Minimum Fuel Problem. Consider the problem:

$$\min\left\{J = \sum_{k=0}^{T-1} |u^k|\right\}$$

subject to

$$\Delta x^k = Ax^k + bu^k, \quad x^0 \text{ and } x^T \text{ given},$$

where $A$ is a given matrix. Obtain the expression for the adjoint
variable and the form of the optimal control.
8.15 Current-Value Formulation. Obtain the current-value formulation
of the discrete maximum principle. Assume that $r$ is the discount
rate, i.e., $1/(1+r)$ is the discount factor.

8.16 Convert the Bolza form problem (8.48)-(8.50) to the equivalent
linear Mayer form; see Section 2.1.4 for a similar conversion in the 
continuous-time case. 

8.17 Convert the problem defined by (8.48) and (8.74) to its Lagrange 
form. Then, obtain the assumptions on the salvage value function
$S(x^T,T)$ so that the results of Section 8.3 apply. Under these
assumptions, state the maximum principle for the Bolza form prob- 
lem defined by (8.48) and (8.74). 

8.18* An Advertising Model (Burdet and Sethi, 1976). Let $x^k$ denote
the sales and $u^k$, $k = 1, 2, \ldots, T-1$, denote the amount of advertising
in period $k$. Formulate the sales-advertising dynamics as

$$\Delta x^k = -\delta x^k + r\sum_{l=0}^{k} f_l^k(x^l,u^l), \quad x^0 \text{ given},$$

where $\delta$ and $r$ are decay and response constants, respectively, and
$f_l^k(x^l,u^l)$ is a nonnegative function that decreases with $x^l$ and
increases with $u^l$. In the special case when

$$f_l^k(x^l,u^l) = \gamma_l^k u^l, \quad \gamma_l^k > 0,$$

obtain optimal advertising amounts to maximize the total discounted
profit given by

$$\sum_{k=1}^{T-1} (\pi x^k - u^k)(1+\rho)^{-k},$$

where, as in Section 7.2.1, $\pi$ denotes per unit sales revenue and $\rho$
denotes the discount rate, and where $0 \le u^k \le Q^k$ represent the
restriction on the advertising amount $u^k$. For a continuous-time
version of problems with lags, see Hartl and Sethi (1984b).
Maintenance and Replacement



The problem of determining the lifetime of an asset or an activity si- 
multaneously with its management during that lifetime is an important 
problem in practice. The most typical example is the problem of optimal
maintenance and replacement of a machine; see Rapp (1974) and
Pierskalla and Voelker (1976). Other examples occur in forest management
such as in Näslund (1969), Clark (1976), and Heaps (1984), and in
advertising copy management as in Pekelman and Sethi (1978).

The first major work dealing with machine replacement problems appeared
in 1949 as a MAPI (Machinery and Allied Products Institute)
study by Terborgh (1949). For the most part, this study was confined to
those problems where the optimization was carried out only with respect
to the replacement lives of the machines under consideration. Boiteux
(1955) and Massé (1962) extended the single machine replacement problem
to include the optimal timing of a partial replacement of the machine
before its actual retirement. Näslund (1966) was the first to solve a
generalized version of the Boiteux problem by using the maximum principle.
He considered optimal preventive maintenance applied continuously over
the entire period instead of a single optimal partial replacement before
the machine is retired. Thompson (1968) presented a modification of
Näslund's model which is described in the following section.

9.1 A Simple Maintenance and Replacement Model

Consider a single machine whose resale value gradually declines over 
time. Its output is assumed to be proportional to its resale value. By 
applying preventive maintenance, it is possible to slow down the rate of 
decline of the resale value. The control problem consists of simultane- 
ously determining the optimal rate of preventive maintenance and the 
sale date of the machine. Clearly this is an optimal control problem with 
unspecified terminal time; see Section 3.1 and Example 3.5. 

9.1.1 The Model 

In order to define Thompson’s model, we use the following notation: 

$T$ = the sale date of the machine to be determined,
$\rho$ = the constant discount rate,
$x(t)$ = the resale value of the machine in dollars at time $t$; let
$x(0) = x_0$,
$u(t)$ = the preventive maintenance rate at time $t$ (maintenance
here means money spent over and above the
minimum required for necessary repairs),
$g(t)$ = the maintenance effectiveness function at time $t$ (measured
in dollars added to the resale value per dollar
spent on preventive maintenance),
$d(t)$ = the obsolescence function at time $t$ (measured in terms
of dollars subtracted from $x$ at time $t$),
$\pi$ = the constant production rate in dollars per unit time
per unit resale value; assume $\pi > \rho$ or else it does not
pay to produce.

It is assumed that $g(t)$ is a nonincreasing function of time and $d(t)$
is a nondecreasing function of time, and that for all $t$

$$u(t) \in \Omega = [0, U], \tag{9.1}$$

where $U$ is a positive constant.

The present value of the machine is the sum of two terms, the dis- 
counted income (production minus maintenance) stream during its life 
plus the discounted resale value at T: 

$$J = \int_0^T \left[\pi x(t) - u(t)\right]e^{-\rho t}\,dt + x(T)e^{-\rho T}. \tag{9.2}$$

The state variable $x$ is affected by the obsolescence factor, the amount
of preventive maintenance, and the maintenance effectiveness function.
Thus,

$$\dot{x}(t) = -d(t) + g(t)u(t), \quad x(0) = x_0. \tag{9.3}$$

In the interests of realism we assume that

$$-d(t) + g(t)U \le 0, \quad t \ge 0. \tag{9.4}$$

The assumption implies that preventive maintenance is not so effective
as to enhance the resale value of the machine over its previous values;
rather it can at most slow down the decline of the resale value, even
when preventive maintenance is performed at the maximum rate $U$. A
modification of (9.3) is given in Arora and Lele (1970). See also Hartl
(1983b).

The optimal control problem is to maximize (9.2) subject to (9.1) 
and (9.3). 

9.1.2 Solution by the Maximum Principle 

This problem is similar to Model Type (a) of Table 3.3 with the free- 
end-point condition as in Row 1 of Table 3.1. Therefore, we follow the 
steps for solution by the maximum principle stated for it in Chapter 3. 
The standard Hamiltonian as formulated in Section 2.2 is 

$$H = (\pi x - u)e^{-\rho t} + \lambda(-d + gu), \tag{9.5}$$

where the adjoint variable $\lambda$ satisfies

$$\dot{\lambda} = -\pi e^{-\rho t}, \quad \lambda(T) = e^{-\rho T}. \tag{9.6}$$

Since $T$ is unspecified, the additional terminal condition of (3.14)
becomes

$$-\rho e^{-\rho T}x(T) = -H, \tag{9.7}$$

which must hold on the optimal path at time $T$.

The adjoint variable $\lambda$ can be easily obtained by integrating (9.6),
i.e.,

$$\lambda(t) = e^{-\rho T} + \frac{\pi}{\rho}\left(e^{-\rho t} - e^{-\rho T}\right). \tag{9.8}$$

The interpretation of $\lambda(t)$ is as follows. It gives, in present value terms,
the marginal profit per dollar of gain in resale value at time $t$. The first
term represents the present value of one dollar of additional salvage value
at $T$ brought about by one dollar of additional resale value at the current
time $t$. The second term represents the present value of incremental
production from $t$ to $T$ brought about by the extra productivity of the
machine due to the additional one dollar of resale value at time $t$.
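
The integration that produces the adjoint variable can be written out explicitly. The following worked step is a sketch that assumes only the adjoint equation $\dot{\lambda} = -\pi e^{-\rho t}$ with terminal condition $\lambda(T) = e^{-\rho T}$; the dummy variable $s$ is introduced here for the integration.

```latex
\begin{aligned}
\lambda(t) &= \lambda(T) - \int_t^T \dot{\lambda}(s)\,ds
            = e^{-\rho T} + \pi \int_t^T e^{-\rho s}\,ds \\
           &= e^{-\rho T} + \frac{\pi}{\rho}\left(e^{-\rho t} - e^{-\rho T}\right).
\end{aligned}
```

The first term is the discounted salvage contribution and the second is the discounted production contribution, which matches the two-term interpretation given above.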

Since the Hamiltonian is linear in the control variable $u$, the optimal
control for a problem with any fixed $T$ is bang-bang as in Model Type
(a) in Table 3.3. Thus,

$$u^*(t) = \text{bang}\left[0, U;\ e^{-\rho T}g + \frac{\pi}{\rho}\left(e^{-\rho t} - e^{-\rho T}\right)g - e^{-\rho t}\right]. \tag{9.9}$$

To interpret this optimal policy, we see that the term

$$e^{-\rho T}g + \frac{\pi}{\rho}\left(e^{-\rho t} - e^{-\rho T}\right)g$$

is the present value of the marginal return from increasing the preventive
maintenance by one dollar at time $t$. The last term in the argument
of the bang function is the present value of that one dollar spent for
preventive maintenance at time $t$. Thus, in words, the optimal policy
means the following: If the marginal return of one dollar of additional
preventive maintenance is more than one dollar, then perform the
maximum possible preventive maintenance, otherwise do not carry out any
at all.

To find how the optimal control switches, we need to examine the
switching function in (9.9). Rewriting it as

$$e^{-\rho t}\left[\frac{\pi g}{\rho} - \left(\frac{\pi}{\rho} - 1\right)e^{\rho(t-T)}g - 1\right] \tag{9.10}$$

and taking the derivative of the bracketed terms with respect to $t$, we
can conclude that the expression inside the square brackets in (9.10) is
monotonically decreasing with time $t$ on account of the assumptions that
$\pi/\rho > 1$ and that $g$ is nonincreasing with $t$. It follows that there will
not be any singular control for any finite interval of time. Furthermore,
since $e^{-\rho t} > 0$ for all $t$, we can conclude that the switching function can
only go from positive to negative and not vice versa. Thus, the optimal
control will be either $U$, or zero, or $U$ followed by zero. The switching
time $t^s$ is obtained as follows: equate (9.10) to zero and solve for $t$. If
the solution is negative, let $t^s = 0$, and if the solution is greater than
$T$, let $t^s = T$; otherwise set $t^s$ equal to the solution. It is clear that the
optimal control in (9.9) can now be rewritten as

$$u^*(t) = \begin{cases}
U & t \le t^s, \\
0 & t > t^s.
\end{cases} \tag{9.11}$$



Note that all the above calculations were made on the assumption
that $T$ was fixed, i.e., without imposing condition (9.7). On an optimal
path, this condition, which uses (9.5), (9.7), and (9.8), can be restated as

$$\left[\pi x^*(T^*) - u^*(T^*)\right] - \left[d(T^*) - g(T^*)u^*(T^*)\right] = \rho x^*(T^*). \tag{9.12}$$

This means that when $u^*(T^*) = 0$ (i.e., $t^s < T^*$), we have

$$x^*(T^*) = \frac{d(T^*)}{\pi - \rho}, \tag{9.13}$$

and when $u^*(T^*) = U$ (i.e., $t^s = T^*$), we have

$$x^*(T^*) = \frac{d(T^*) - \left[g(T^*) - 1\right]U}{\pi - \rho}. \tag{9.14}$$

Since $d(t)$ is nondecreasing, $g(t)$ is nonincreasing, and $x(t)$ is
nonincreasing, equation (9.13) or equation (9.14), whichever the case may be,
has a solution for $T^*$.
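
A short worked step may help connect the terminal condition (9.7) to the two terminal-state formulas. The sketch below assumes $\lambda(T) = e^{-\rho T}$ and, as a shorthand introduced here, writes $x$, $u$, $d$, and $g$ for the values of these quantities at the optimal sale date.

```latex
\begin{aligned}
H(T) &= (\pi x - u)e^{-\rho T} + e^{-\rho T}(-d + g u) = \rho e^{-\rho T} x \\
\Longrightarrow \quad (\pi - \rho)\,x &= d - (g - 1)\,u .
\end{aligned}
```

Setting $u = 0$ gives $x = d/(\pi - \rho)$, and setting $u = U$ gives $x = [d - (g-1)U]/(\pi - \rho)$.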



9.1.3 A Numerical Example 

It is instructive to work an example of this model in which specific values
are assumed for the various functions. Examples that illustrate
other kinds of qualitatively different behavior are left as Exercises 9.2-9.4.

Suppose $U = 1$, $x(0) = 100$, $d(t) = 2$, $\pi = 0.1$, $\rho = 0.05$, and
$g(t) = 2/(1+t)^{1/2}$. Let the unit of time be one month. First, we write
the condition on $t^s$ by equating (9.10) to 0, which gives

$$\pi - (\pi - \rho)e^{-\rho(T - t^s)} = \frac{\rho}{g(t^s)}. \tag{9.15}$$

In doing so, we have assumed that the solution of (9.15) lies in the
open interval $(0,T)$. As we shall indicate later, special care needs to be
exercised if this is not the case.







Substituting the data in (9.15) we have

$$0.1 - 0.05e^{-0.05(T - t^s)} = 0.025(1 + t^s)^{1/2},$$

which simplifies to

$$(1 + t^s)^{1/2} = 4 - 2e^{-0.05(T - t^s)}. \tag{9.16}$$

Then, integrating (9.3), we find

$$x(t) = -2t + 4(1+t)^{1/2} + 96, \quad \text{if } t \le t^s,$$

and hence

$$x(t) = -2t^s + 4(1+t^s)^{1/2} + 96 - 2(t - t^s) = 4(1+t^s)^{1/2} + 96 - 2t, \quad \text{if } t > t^s.$$

Since we have assumed $0 < t^s < T$, we substitute $x(T)$ into (9.13), and
obtain

$$4(1+t^s)^{1/2} + 96 - 2T = 2/0.05 = 40,$$

which simplifies to

$$T = 2(1+t^s)^{1/2} + 28. \tag{9.17}$$

We must solve (9.16) and (9.17) simultaneously. Substituting (9.17) into
(9.16), we find that $t^s$ must be a zero of the function

$$h(t^s) = (1+t^s)^{1/2} - 4 + 2\exp\left\{-\left[2(1+t^s)^{1/2} + 28 - t^s\right]/20\right\}. \tag{9.18}$$

A simple binary search program was written to solve this equation, which
obtained the value $t^s = 10.6$. Substitution of this into (9.17) yields $T =
34.8$. Since this satisfies our supposition that $0 < t^s < T$, we can conclude
our computations. Thus, the optimal solution is to perform preventive
maintenance at the maximum rate during the first 10.6 months, and
thereafter not at all. The sale date is at 34.8 months after purchase.
Figure 9.1 gives the functions $x(t)$ and $u(t)$ for this optimal maintenance
and sale date policy.
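
The binary search mentioned above is easy to reproduce. The following is a minimal sketch, not the program used by the authors: it assumes the data of this example ($U = 1$, $d = 2$, $\pi = 0.1$, $\rho = 0.05$, $g(t) = 2/(1+t)^{1/2}$), under which the switching time is a zero of $h(t^s) = (1+t^s)^{1/2} - 4 + 2\exp\{-[2(1+t^s)^{1/2} + 28 - t^s]/20\}$ and the sale date is $T = 2(1+t^s)^{1/2} + 28$. The search bracket $[0, 20]$ and the helper name `bisect` are choices made here for illustration.

```python
import math


def h(ts):
    """Switching-time function of the numerical example; its zero is t^s."""
    root = math.sqrt(1.0 + ts)
    return root - 4.0 + 2.0 * math.exp(-(2.0 * root + 28.0 - ts) / 20.0)


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)


ts = bisect(h, 0.0, 20.0)                   # the bracket [0, 20] is an assumption
T = 2.0 * math.sqrt(1.0 + ts) + 28.0        # sale date implied by the switching time
print(round(ts, 1), round(T, 1))
```

Rounded to one decimal place, the result should agree with the values 10.6 and 34.8 reported in the text. Bisection is adequate here because $h$ is increasing on the bracket, so the zero is unique.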

If, on the other hand, the solution of (9.16) and (9.17) did not satisfy
our supposition, we would need to follow the procedure outlined earlier in
the section. This would result in $t^s = 0$ or $t^s = T$. If $t^s = 0$, we would
obtain $T$ from (9.17), and conclude $u^*(t) = 0$, $0 \le t \le T$. Alternatively,
if $t^s = T$, we would need to substitute $x(T)$ into (9.14) to obtain $T$. In
this case the optimal control would be $u^*(t) = U$, $0 \le t \le T$.

Figure 9.1: Optimal Maintenance and Machine Resale Value

9.1.4 An Extension 

The pure bang-bang result in the model developed above is a result of the
linearity in the problem. The result can be enriched as in Sethi (1973b)
by generalizing the resale value equation (9.3) as follows:

$$\dot{x}(t) = -d(t) + g(u(t), t), \tag{9.19}$$

where $g$ is nondecreasing and concave in $u$. For this section, we will
assume the sale date $T$ to be fixed for simplicity and $g$ to be strictly
concave in $u$, i.e., $g_u \ge 0$ and $g_{uu} < 0$ for all $t$. Also, $g_t \le 0$, $g_{ut} \le 0$,
and $g(0,t) = 0$; see Exercise 9.5 for an example of the function $g(u,t)$.
The standard Hamiltonian is

$$H = (\pi x - u)e^{-\rho t} + \lambda\left[-d + g(u,t)\right], \tag{9.20}$$

where $\lambda$ is given in (9.8). To maximize the Hamiltonian, we differentiate
it with respect to $u$ and equate the result to zero. Thus,

$$H_u = -e^{-\rho t} + \lambda g_u = 0. \tag{9.21}$$

If we let $u^0(t)$ denote the solution of (9.21), then $u^0(t)$ maximizes the
Hamiltonian (9.20) because of the concavity of $g$ in $u$. Thus, for a fixed
$T$, the optimal control is

$$u^*(t) = \text{sat}\left[0, U; u^0(t)\right]. \tag{9.22}$$



To determine the direction of change in $u^0(t)$, we use (9.21) and the
value $\lambda(t)$ from (9.8) to obtain

$$g_u = \frac{e^{-\rho t}}{\lambda(t)} = \frac{1}{\dfrac{\pi}{\rho} - \left(\dfrac{\pi}{\rho} - 1\right)e^{\rho(t-T)}}. \tag{9.23}$$

Since $\pi > \rho$, the denominator on the right-hand side of (9.23) is
monotonically decreasing with time. Therefore, the right-hand side of (9.23) is
increasing with time. Taking the time derivative of (9.23), we have

$$g_{ut} + g_{uu}\dot{u}^0 \ge 0.$$

But $g_{ut} \le 0$ and $g_{uu} < 0$; it is therefore obvious that $\dot{u}^0(t) \le 0$. In order
now to sketch the optimal control $u^*(t)$ specified in (9.22), let us define
$0 \le t_1 \le t_2 \le T$ such that $u^0(t) \ge U$ for $t \le t_1$ and $u^0(t) \le 0$ for $t \ge t_2$.
Then, we can rewrite the sat function in (9.22) as

$$u^*(t) = \begin{cases}
U & \text{for } t \in [0, t_1], \\
u^0(t) & \text{for } t \in (t_1, t_2), \\
0 & \text{for } t \in [t_2, T].
\end{cases} \tag{9.24}$$

In (9.24), it is possible to have $t_1 = 0$ and/or $t_2 = T$. In Figure 9.2 we
have sketched a case when $t_1 > 0$ and $t_2 < T$.

Note that while $u^0(t)$ in Figure 9.2 is decreasing over time, the way
it will decrease will depend on the nature of the function $g$. Indeed,
the shape of $u^0(t)$, while always decreasing, can be quite general. In
particular, you will see in Exercise 9.5 that the shape of $u^0(t)$ is concave
and, furthermore, $u^0(t) > 0$, $t \ge 0$, so that $t_2 = T$ in that case.
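
The saturated policy can be illustrated with a concrete effectiveness function. The sketch below is an assumption-laden illustration, not the function used in the exercises: it takes the hypothetical form $g(u,t) = 2u^{1/2}/(1+t)^{1/2}$, which satisfies the sign conditions stated above, and uses illustrative parameter values. For this $g$, the first-order condition $g_u = e^{-\rho t}/\lambda(t)$ can be solved in closed form for the interior control.

```python
import math

# Illustrative parameter values only; they are not prescribed by the model.
PI, RHO, U_MAX, T = 0.1, 0.05, 1.0, 34.8


def denom(t):
    """lambda(t) * exp(rho * t): pi/rho - (pi/rho - 1) * exp(rho * (t - T))."""
    return PI / RHO - (PI / RHO - 1.0) * math.exp(RHO * (t - T))


def u_interior(t):
    """Solve g_u = 1 / denom(t) for the assumed g(u, t) = 2 sqrt(u) / sqrt(1 + t)."""
    # Here g_u = 1 / (sqrt(u) * sqrt(1 + t)), so sqrt(u) = denom(t) / sqrt(1 + t).
    return denom(t) ** 2 / (1.0 + t)


def u_star(t):
    """Saturate the interior solution to the admissible interval [0, U_MAX]."""
    return min(U_MAX, max(0.0, u_interior(t)))


grid = [T * i / 100 for i in range(101)]
policy = [u_star(t) for t in grid]
for t, u in list(zip(grid, policy))[::20]:
    print(round(t, 2), round(u, 4))
```

With these assumed values the policy starts at the upper bound, leaves it once the interior solution drops below `U_MAX`, and stays strictly positive up to the sale date, so the lower saturation is never reached.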



Figure 9.2: Sat Function Optimal Control

9.2 Maintenance and Replacement for a Machine Subject to Failure

In Kamien and Schwartz (1971a), a related model is developed which has
somewhat different assumptions. They assume that the production rate
of the machine is independent of its age, while its probability of failure
increases with its age. Consistent with this assumption, the purpose of
preventive maintenance in the Kamien-Schwartz model is to influence
the failure rate of the machine rather than arrest the deterioration in
the resale value as before. Furthermore, their model also allows for sale
of the machine at any time, provided it is still in running condition, and
for its disposal as junk if it breaks down for good. The optimal control
problem therefore is to find an optimal maintenance policy for the period
of ownership, and an optimal sale date at which the machine should be
sold provided it has not yet failed. Other references to related models are
Alam, Lynn, and Sarma (1976), Alam and Sarma (1974, 1977), Sarma
and Alam (1975), and Gaimon and Thompson (1984a, 1989).

9.2.1 The Model 

In order to define the Kamien-Schwartz model, we use the following 
notation: 

$T$ = the sale date of a machine to be determined,
$u(t)$ = the preventive maintenance rate at time $t$;
$0 \le u(t) \le 1$,
$R$ = the constant positive rate of revenue produced by a
functioning machine independent of its age at any
time, net of all costs except preventive maintenance,
$\rho$ = the constant discount rate,
$L$ = the constant positive junk value of the machine
independent of its age at failure,
$B(t)$ = the (exogenously specified) resale value of the machine
at time $t$, if it is still functioning; $\dot{B}(t) \le 0$,
$h(t)$ = the natural failure rate (also termed the natural hazard
rate in reliability theory); $h(t) \ge 0$, $\dot{h}(t) \ge 0$,
$F(t)$ = the cumulative probability that the machine has failed
by time $t$,
$C(u,h)$ = the cost function depending on the preventive
maintenance $u$ when the natural failure rate is $h$.



To make economic sense, an operable machine must be worth at least
as much as an inoperable machine and its resale value should not exceed
the present value of the potential revenue generated by the machine if it
were to function forever. Thus,

$$0 \le L \le B(t) \le R/\rho, \quad t \ge 0. \tag{9.25}$$

Also for all $t \ge 0$,

$$u(t) \in \Omega = [0, 1]. \tag{9.26}$$

Finally, the cost of reducing the natural failure rate is assumed to be
proportional to the natural failure rate. Specifically, we assume that
$C(u,h) = C(u)h$ denotes the cost of preventive maintenance $u$ when the
natural failure rate is $h$. In other words, when the natural failure rate
is $h$ and a controlled failure rate of $h(1-u)$ is sought, the action of
achieving this reduction will cost $C(u)h$ dollars. It is assumed that

$$C(0) = 0, \quad C_u > 0, \quad C_{uu} > 0, \quad \text{for } u \in [0, 1]. \tag{9.27}$$

Thus, the cost of reducing the failure rate increases more than
proportionately as the fractional reduction increases. But the cost of a
given fractional reduction increases linearly with the natural failure rate.
Hence, these conditions imply that a given absolute reduction becomes
increasingly more costly as the machine gets older.

To derive the state equation for $F(t)$, we note that $\dot{F}/(1-F)$ denotes
the conditional probability density for the failure of the machine at time $t$,
given that it has survived to time $t$. This is assumed to depend on
two things, namely (i) the natural failure rate that governs the machine
in the absence of preventive maintenance, and (ii) the current rate of
preventive maintenance. Thus,

$$\frac{\dot{F}(t)}{1 - F(t)} = h(t)\left[1 - u(t)\right], \tag{9.28}$$

which gives the state equation

$$\dot{F} = h(1-u)(1-F), \quad F(0) = 0. \tag{9.29}$$

Thus, the controlled failure rate at time $t$ is $h(t)[1 - u(t)]$. If $u = 0$,
the failure rate assumes its natural value $h$. As $u$ increases, the failure
rate decreases and drops to zero when $u = 1$.
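
The effect of maintenance on the failure distribution can be seen in a small simulation. The sketch below integrates the state equation $\dot{F} = h(1-u)(1-F)$, $F(0) = 0$, by Euler steps and compares the result with the closed form $F(t) = 1 - e^{-h(1-u)t}$, which holds only in the special case of constant $h$ and $u$. The numerical values and the helper name `simulate_failure` are hypothetical.

```python
import math


def simulate_failure(h, u, T, steps=100000):
    """Euler integration of F' = h(t) * (1 - u(t)) * (1 - F) with F(0) = 0."""
    dt = T / steps
    F = 0.0
    for i in range(steps):
        t = i * dt
        F += dt * h(t) * (1.0 - u(t)) * (1.0 - F)
    return F


# Hypothetical constant hazard rate and maintenance rate.
h0, u0, T = 0.2, 0.5, 10.0
F_euler = simulate_failure(lambda t: h0, lambda t: u0, T)
F_exact = 1.0 - math.exp(-h0 * (1.0 - u0) * T)   # valid for constant h and u only
print(F_euler, F_exact)
```

Passing time-varying functions for `h` and `u` lets the same routine handle an increasing hazard rate, for which no simple closed form is assumed here.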

The expected present value of the machine is the sum of the expected 
present values of (i) the total revenue it produces less the total cost of 
maintenance, (ii) its junk value if it should fail, and (iii) the salvage value 
if it does not fail and is sold. That is, 

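$$J = \int_0^T e^{-\rho t}\left\{\left[R - C(u)h\right](1-F) + L\dot{F}\right\}dt + e^{-\rho T}B(T)\left[1 - F(T)\right].$$

Using (9.29), we can rewrite $J$ as follows:

$$J = \int_0^T e^{-\rho t}\left[R - C(u)h + L(1-u)h\right](1-F)\,dt + e^{-\rho T}B(T)\left[1 - F(T)\right]. \tag{9.30}$$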

The optimal control problem is to maximize J in (9.30) subject to (9.29) 
and (9.26). 

9.2.2 Optimal Policy 

The problem is similar to Model Type (f) in Table 3.3 subject to the 
free-end-point condition as in Row 1 of Table 3.1. Therefore, we follow 
the steps for solution by the maximum principle stated in Chapter 3.
The standard Hamiltonian is

$$H = e^{-\rho t}\left[R - C(u)h + L(1-u)h\right](1-F) + \lambda\left[(1-u)h(1-F)\right], \tag{9.31}$$

and the adjoint variable satisfies

$$\dot{\lambda} = e^{-\rho t}\left[R - C(u)h + L(1-u)h\right] + \lambda h(1-u), \quad \lambda(T) = -e^{-\rho T}B(T). \tag{9.32}$$

Since $T$ is unspecified, we apply the additional terminal condition (3.14)
to obtain (see Exercise 9.6)

$$R - C[u^*(T^*)]h(T^*) + L[1 - u^*(T^*)]h(T^*) - \left[\rho + \{1 - u^*(T^*)\}h(T^*)\right]B(T^*) = -\dot{B}(T^*). \tag{9.33}$$

We will now interpret (9.33) as follows: The first two terms in
(9.33) give the net cash inflow (revenue minus cost of preventive
maintenance), to which is added the junk value $L$ multiplied by the probability
$[1 - u^*(T^*)]h(T^*)$ that the machine fails. From this, we subtract the
third term which is the sum of loss of interest on resale value $\rho B(T^*)$
and the loss of the entire resale value, when the machine fails, with
probability $[1 - u^*(T^*)]h(T^*)$. Thus, the left-hand side of (9.33) represents
the marginal benefit of keeping the machine. On the other hand, the
right-hand side term is simply the instantaneous deterioration in the
resale value when the sale of the machine is postponed for a short period
of time. Hence, equation (9.33) determining the optimal sale date is the
usual economic condition equating marginal benefit to marginal cost.

Next, we analyze the problem to obtain the optimal maintenance
policy for a fixed $T$. If the optimal solution is in the interior, i.e., $u^* \in
(0, 1)$, then the Hamiltonian maximizing condition gives

$$H_u = -e^{-\rho t}h(1-F)\left[C_u + L + e^{\rho t}\lambda\right] = 0. \tag{9.34}$$

In the trivial case in which the natural failure rate $h(t)$ is zero or when
the machine fails with certainty by time $t$ (i.e., $F(t) = 1$), then $u^*(t) = 0$.
Assume therefore $h > 0$ and $F < 1$. Under these conditions, we can infer
from (9.27) and (9.34) that

$$\begin{cases}
\text{(i) } C_u(0) + L + \lambda e^{\rho t} \ge 0 \ \Rightarrow\ u^*(t) = 0, \\
\text{(ii) } C_u(1) + L + \lambda e^{\rho t} \le 0 \ \Rightarrow\ u^*(t) = 1, \\
\text{(iii) Otherwise, } C_u + L + \lambda e^{\rho t} = 0 \text{ determines } u^*(t).
\end{cases} \tag{9.35}$$

Using the terminal condition $\lambda(T) = -e^{-\rho T}B(T)$ from (9.32), we can
derive $u^*(T)$ satisfying (9.35):

$$\begin{cases}
\text{(i) } C_u(0) \ge B(T) - L \ \Rightarrow\ u^*(T) = 0, \\
\text{(ii) } C_u(1) \le B(T) - L \ \Rightarrow\ u^*(T) = 1, \\
\text{(iii) Otherwise, } C_u = B(T) - L \ \Rightarrow\ u^*(T).
\end{cases} \tag{9.36}$$
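
The three terminal cases are simple to evaluate once a cost function is chosen. The sketch below assumes a hypothetical quadratic cost $C(u) = au + \frac{1}{2}bu^2$ with $a, b > 0$, which satisfies $C(0) = 0$, $C_u > 0$, and $C_{uu} > 0$ on $[0, 1]$; the model itself does not prescribe this form, and `terminal_control` is an illustrative name.

```python
def terminal_control(B_T, L, a, b):
    """u*(T) from the three cases, for the assumed cost C(u) = a*u + 0.5*b*u**2."""
    gap = B_T - L                  # resale value minus junk value at the sale date
    if a >= gap:                   # case (i): C_u(0) >= B(T) - L
        return 0.0
    if a + b <= gap:               # case (ii): C_u(1) <= B(T) - L
        return 1.0
    return (gap - a) / b           # case (iii): solve a + b*u = B(T) - L


# Hypothetical numbers, one for each case.
print(terminal_control(B_T=10.0, L=8.0, a=3.0, b=4.0))
print(terminal_control(B_T=20.0, L=5.0, a=3.0, b=4.0))
print(terminal_control(B_T=10.0, L=5.0, a=3.0, b=4.0))
```

Because the marginal cost $a + bu$ is increasing in $u$, exactly one of the three cases applies for any given gap between resale and junk value.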



The next question is to determine how $u^*(t)$ changes over time.
Kamien and Schwartz (1971a, 1998) have shown that $\dot{u}^*(t) \le 0$; see
Exercise 9.7. That means there exist $T \ge t_2 \ge t_1 \ge 0$ such that

$$u^*(t) = \begin{cases}
1 & \text{for } t \in [0, t_1], \\
u^0(t) & \text{for } t \in (t_1, t_2), \\
0 & \text{for } t \in [t_2, T].
\end{cases} \tag{9.37}$$

Here $u^0(t)$ is the solution of (9.35)(iii), and it is easy to show that $\dot{u}^0(t) \le
0$. Of course, $u^*(T)$ is immediately known from (9.36). If $u^*(T) \in (0, 1)$,
it implies $t_2 = T$; and if $u^*(T) = 1$, it implies $t_1 = t_2 = T$.

For this model, the sufficiency of the maximum principle follows from 
Theorem 2.1; see Exercise 9.8. 



9.2.3 Determination of the Sale Date 

For a fixed T, we know that the terminal optimal control u*{T) is deter- 
mined by (9.36). If this u*{T) also satisfies (9.33), we have determined 
an optimal trajectory as well as the optimal life of the machine. This, 
of course, is subject to the second-order condition since (9.33) is only 
a necessary condition for an optimal T* to satisfy. It is clear that the 
determination of T*, in most cases, will require numerical computations. 
The algorithm needs only be a simple search method because it requires 
consideration of the single variable T. 
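
As an illustration of such a one-variable search, the following Python sketch applies golden-section search to a value function of $T$. It is not taken from the model above: the name `golden_section_max` and the test function are hypothetical, and the method assumes that the value function is unimodal on the search interval. Since (9.33) is only a necessary condition, this assumption would have to be checked for the problem at hand.

```python
import math


def golden_section_max(J, lo, hi, tol=1e-6):
    """Maximize a unimodal function J on [lo, hi] by golden-section search.

    J is any callable returning the value of the objective for a given
    sale date T. Unimodality on [lo, hi] is an assumption of this sketch.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if J(c) > J(d):
            # The maximizer lies in [a, d]; the old c becomes the new d.
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # The maximizer lies in [c, b]; the old d becomes the new c.
            a, c = c, d
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Hypothetical concave stand-in for the value of selling at time T.
    toy_value = lambda T: -(T - 3.0) ** 2
    print(golden_section_max(toy_value, 0.0, 10.0))
```

In practice each evaluation of `J` would involve solving the fixed-$T$ maintenance problem, so a method that needs few function evaluations is attractive.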

Before we go to the next section, we remark that a business is usually 
a continuing entity and does not end at the sale date of one machine. 
Normally, an existing machine will be replaced by another which, in 
turn, will be replaced by another, and so on. The technology of the 
newer machines will in general be different from that of the existing 
machine. In what follows, we address these issues. We shall choose the 
discrete-time setting and illustrate the use of the discrete-time maximum 
principle developed in Chapter 8. 

9.3 Chain of Machines 

We now extend the problem of maintenance and replacement to a chain of 
machines. By this we mean that given the time periods $0, 1, 2, \ldots, T-1$, 
we begin with a machine purchase at the beginning of period zero. Then, 
we find an optimal number of machines, say $\ell$, and optimal times 
$0 < t_1 < t_2 < \cdots < t_\ell < T$ of their replacements such that the existing 
machine will be replaced by a new machine at time $t_j$, $j = 1, 2, \ldots, \ell$. 
At the end of the horizon defined by the beginning of period $T$, the last 
machine purchased will be salvaged. Moreover, the optimal maintenance 
policy for each of the machines in the chain must be found. 

Two approaches to this problem have been developed in the litera- 
ture. The first attempts to solve for an infinite horizon ($T = \infty$) with a 
simplifying assumption of identical machine lives, i.e., 

$$t_j - t_{j-1} = t_{j+1} - t_j \tag{9.38}$$

for all $j \ge 1$; see Sethi (1973b). In this case $\ell = \infty$ as well. The second 
relaxes the assumption (9.38) of identical machine lives, but then, it can 
only solve a finite horizon problem involving a finite chain of machines, 
i.e., $\ell$ is finite; see Sethi and Morton (1972) and Tapiero (1973). For 
a decision horizon formulation of this problem, see Sethi and Chand 
(1979), Chand and Sethi (1982), and Bylka, Sethi, and Sorger (1992). 

In this section, we will deal with the latter problem as analyzed by 
Sethi and Morton (1972). The problem is solved by a mixed optimization 
technique. The subproblems dealing with the maintenance policy are 
solved by appealing to the discrete maximum principle. These subprob- 
lem solutions are then incorporated into a Wagner and Whitin (1958) 
model formulation for solution of the full problem. The procedure is 
illustrated by a numerical example. 

9.3.1 The Model 

Consider buying a machine at the beginning of period $s$ and salvaging it 
at the beginning of period $t > s$. Let $J_{st}$ denote the present value of all 
net earnings associated with the machine. To calculate $J_{st}$ we need the 
following notation for $s \le k \le t - 1$. 

$x_s^k$ = the resale value of the machine at the beginning of period $k$, 

$P_s^k$ = the production quantity (in dollar value) during period $k$, 

$E_s^k$ = the necessary expense of the ordinary maintenance (in dollars) during period $k$, 

$R_s^k$ = $P_s^k - E_s^k$, 

$u^k$ = the rate of preventive maintenance (in dollars) during period $k$, 

$C_s$ = the cost of purchasing machine at the beginning of period $s$, 

$\rho$ = the periodic discount rate. 

It is required that 

$$0 \le u^k \le U_{sk}, \quad k \in [s, t-1]. \tag{9.39}$$

We can calculate $J_{st}$ in terms of the variables and functions defined 
above: 

$$J_{st} = \sum_{k=s}^{t-1} R_s^k (1+\rho)^{-k} - \sum_{k=s}^{t-1} u^k (1+\rho)^{-k} - C_s (1+\rho)^{-s} + x_s^t (1+\rho)^{-t}. \tag{9.40}$$

We must also have functions that describe how the states change with 
the age of the machine and the amount of preventive maintenance. 
Also, assuming that at time $s$, the only machines available are those 
that are up-to-date with respect to the technology prevailing at $s$, we 
can subscript these functions by $s$ to reflect the effect of the machine’s 
technology on its state at a later time $k$. Let $\Psi_s(u^k, k)$ and $\Phi_s(u^k, k)$ 
be such concave functions so that we can write the following state 
equations: 

$$\Delta R_s^k = R_s^{k+1} - R_s^k = \Psi_s(u^k, k), \quad R_s^s \text{ given}, \tag{9.41}$$

$$\Delta x_s^k = \Phi_s(u^k, k), \quad x_s^s = (1 - \delta) C_s, \tag{9.42}$$

where $\delta$ is the fractional depreciation immediately after purchase of the 
machine at time $s$. 

To convert the problem into the Mayer form, define 

$$A_s^k = \sum_{i=s}^{k-1} R_s^i (1+\rho)^{-i}, \tag{9.43}$$

$$B_s^k = \sum_{i=s}^{k-1} u^i (1+\rho)^{-i}. \tag{9.44}$$







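Using equations (9.43) and (9.44), we can write the optimal control 
problem as follows: 

$$\max \left[ J_{st} = A_s^t - B_s^t - C_s (1+\rho)^{-s} + x_s^t (1+\rho)^{-t} \right] \tag{9.45}$$

subject to 

$$\Delta A_s^k = R_s^k (1+\rho)^{-k}, \quad A_s^s = 0, \tag{9.46}$$

$$\Delta B_s^k = u^k (1+\rho)^{-k}, \quad B_s^s = 0, \tag{9.47}$$

and the constraints (9.41), (9.42), and (9.39). 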

9.3.2 Solution by the Discrete Maximum Principle 

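We associate the adjoint variables $\lambda_1^{k+1}$, $\lambda_2^{k+1}$, $\lambda_3^{k+1}$, and $\lambda_4^{k+1}$, 
respectively, with the state equations (9.46), (9.47), (9.41), and (9.42). 
Therefore, the Hamiltonian becomes 

$$H = \lambda_1^{k+1} R_s^k (1+\rho)^{-k} + \lambda_2^{k+1} u^k (1+\rho)^{-k} + \lambda_3^{k+1} \Psi_s + \lambda_4^{k+1} \Phi_s, \tag{9.48}$$

where the adjoint variables $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ satisfy the following 
difference equations and terminal boundary conditions: 

$$\Delta \lambda_1^k = -\frac{\partial H}{\partial A_s^k} = 0, \quad \lambda_1^t = 1, \tag{9.49}$$

$$\Delta \lambda_2^k = -\frac{\partial H}{\partial B_s^k} = 0, \quad \lambda_2^t = -1, \tag{9.50}$$

$$\Delta \lambda_3^k = -\frac{\partial H}{\partial R_s^k} = -\lambda_1^{k+1} (1+\rho)^{-k}, \quad \lambda_3^t = 0, \tag{9.51}$$

$$\Delta \lambda_4^k = -\frac{\partial H}{\partial x_s^k} = 0, \quad \lambda_4^t = (1+\rho)^{-t}. \tag{9.52}$$

The solutions of these equations are 

$$\lambda_1^k = 1, \tag{9.53}$$

$$\lambda_2^k = -1, \tag{9.54}$$

$$\lambda_3^k = \sum_{i=k}^{t-1} (1+\rho)^{-i}, \tag{9.55}$$

$$\lambda_4^k = (1+\rho)^{-t}. \tag{9.56}$$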

Note that $\lambda_1^k$, $\lambda_2^k$, and $\lambda_4^k$ are constants for a fixed machine salvage time 
$t$. To apply the maximum principle, we substitute (9.53)-(9.56) into the 
Hamiltonian (9.48), collect terms containing the control variable $u^k$, and 
rearrange and decompose $H$ as 

$$H = H_1 + H_2, \tag{9.57}$$

where $H_1$ is that part of $H$ which is independent of $u^k$ and 

$$H_2 = -u^k (1+\rho)^{-k} + \sum_{i=k+1}^{t-1} (1+\rho)^{-i}\, \Psi_s + (1+\rho)^{-t}\, \Phi_s. \tag{9.58}$$

Next we apply the maximum principle to obtain the necessary con- 
dition for the optimal schedule of preventive maintenance expenditures 
in dollars. The condition of optimality is that $H$ should be a maximum 
along the optimal path. If $u^k$ were unconstrained, this condition, given 
the concavity of $\Psi_s$ and $\Phi_s$, would be equivalent to setting the partial 
derivative of $H$ with respect to $u$ equal to zero, i.e., 

$$H_{u^k} = [H_2]_{u^k} = -(1+\rho)^{-k} + \frac{\partial \Psi_s}{\partial u^k} \sum_{i=k+1}^{t-1} (1+\rho)^{-i} + \frac{\partial \Phi_s}{\partial u^k} (1+\rho)^{-t} = 0. \tag{9.59}$$

Equation (9.59) is an equation in $u^k$ with the exception of the particular 
case when $\Psi_s$ and $\Phi_s$ are linear in $u^k$ (which will be treated later in this 
section). In general, (9.59) may or may not have a unique solution. For 
our case we will assume $\Psi_s$ and $\Phi_s$ to be of the form such that they 
give a unique solution for $u^k$. One such case occurs when $\Psi_s$ and $\Phi_s$ are 
quadratic in $u^k$. In this case, (9.59) is linear in $u^k$ and can be solved 
explicitly for a unique solution for $u^k$. Whenever a unique solution does 
exist, let this be 

$$u^k = U_k^{st}. \tag{9.60}$$

The optimal control $u^{k*}$ is given as 

$$
u^{k*} = \begin{cases}
0 & \text{if } U_k^{st} \le 0, \\
U_k^{st} & \text{if } 0 \le U_k^{st} \le U_{sk}, \\
U_{sk} & \text{if } U_k^{st} \ge U_{sk}.
\end{cases}
\tag{9.61}
$$



9.3.3 Special Case of Bang-Bang Control 

We now treat the special case in which the problem, and therefore $H$, 
is linear in the control variable $u^k$. In this case, $H$ can be maximized 
simply by having the control at its maximum when the coefficient of 
$u^k$ in $H$ is positive, and minimum when it is negative, i.e., the optimal 
control is of bang-bang type. 

In our problem, we obtain the special case if $\Psi_s$ and $\Phi_s$ assume the 
form 

$$\Psi_s(u^k, k) = \psi_s^k u^k \tag{9.62}$$

and 

$$\Phi_s(u^k, k) = \phi_s^k u^k, \tag{9.63}$$

respectively, where $\psi_s^k$ and $\phi_s^k$ are given constants. Then, the coefficient 
of $u^k$ in $H$, denoted by $W_s(k,t)$, is 

$$W_s(k,t) = -(1+\rho)^{-k} + \psi_s^k \sum_{i=k+1}^{t-1} (1+\rho)^{-i} + \phi_s^k (1+\rho)^{-t}, \tag{9.64}$$

and the optimal control $u^{k*}$ is given by 

$$u^{k*} = \text{bang}[0, U_{sk}; W_s(k,t)], \quad k = s, s+1, \ldots, t-1. \tag{9.65}$$

9.3.4 Incorporation into the Wagner-Whitin Framework 
for a Complete Solution 

Once $u^{k*}$ has been obtained as in (9.61) or (9.65), we can substitute it 
into (9.41) and (9.42) to obtain $R_s^{k*}$ and $x_s^{k*}$, which in turn can be used 
in (9.40) to obtain the optimal value of the objective function denoted 
by $J^*_{st}$. This can be done for each pair of machine purchase time $s$ and 
sale time $t > s$. 

Let $g_s$ denote the present value of the profit (discounted to period 0) 
of an optimal replacement and preventive maintenance policy for periods 
$s, s+1, \ldots, T-1$. Then, 

$$g_s = \max_{t = s+1, \ldots, T} \left[ J^*_{st} + g_t \right], \quad 0 \le s \le T-1, \tag{9.66}$$

with the boundary condition 

$$g_T = 0. \tag{9.67}$$

The value of $g_0$ will give the required maximum. 

The mixed optimization technique presented here avoids many of 
the shortcomings of either pure dynamic programming or pure control 
theory formulations. Since the solution technique used to optimize a 
given machine represents a submodule of the overall method, the pure 
dynamic programming approach may be recognized as a special case. It 
should be advantageous, however, to be able to use a methodology for the 
submodule that is most efficient for a given particular problem. Previous 
control theory formulations do not seem to be easily adaptable to the 
situation of an existing initial machine; see Sethi and Morton (1972) for 
other similar asymmetries. 

The mixed technique can also be adapted to the case of probabilistic 
technological breakthroughs (Exercise 9.10). Here the path of technolog- 
ical growth is assumed to be a tree with probabilities associated with its 
branches. The subproblems can be solved by using the maximum prin- 
ciple for stochastic networks given in Sethi and Thompson (1977). How- 
ever, the number of subproblems that must be solved increases rapidly 
with the number of branches, thus putting computational limitations on 
the general usefulness of this extension. 

Another application of the mixed technique has been used by Pekel- 
man and Sethi (1978) to obtain the optimal durations of advertising 
copies, and the optimal level of advertising expenditures for each copy. 

9.3.5 A Numerical Example 

To illustrate the procedure, a simple three-period example will be pre- 
sented and solved for the case where there is no existing machine at time 
zero. 

Machines may be bought at times 0, 1, and 2. The cost of a machine 
bought at time $s$ is assumed to be 

$$C_s = 1{,}000 + 500 s^2.$$

The discount rate, the fractional instantaneous depreciation at purchase, 
and the maximum preventive maintenance per period are assumed to be 

$$\rho = 0.06, \quad \delta = 0.25, \quad \text{and} \quad U = \$100,$$

respectively. 

Let $R_s^s$ be the net return (net of necessary maintenance) of a machine 
purchased in period $s$ and operated in period $s$. We assume 

$$R_0^0 = \$600, \quad R_1^1 = \$1{,}000, \quad \text{and} \quad R_2^2 = \$1{,}100.$$

In a period $k$ subsequent to the period $s$ of machine purchase, the 
return $R_s^k$, $k > s$, depends on the preventive maintenance performed 
on the machine in periods prior to period $k$. The incremental return 
function is given by $\Psi_s(u^k, k)$, which we assume to be linear. Specifically, 

$$\Delta R_s^k = \Psi_s(u^k, k) = -d_s + a_s u^k,$$

where 

$$d_0 = 200, \quad d_1 = 50, \quad d_2 = 100, \quad \text{and} \quad a_s = 0.5 + 0.1 s^2.$$

This means that the return in period $k$ on a machine purchased in period 
$s$ goes down by an amount $d_s$ every period between $s$ and $k$, including $s$, 
in which there is no preventive maintenance. This decrease can be offset 
by an amount proportional to the amount of preventive maintenance. 

Note that the function $\Psi_s$ is assumed to be stationary over time in 
order to simplify the example. 

Let $x_s^k$ be the salvage value at time $k$ of a machine purchased at $s$. 
We assume 

$$x_s^s = (1 - \delta) C_s = 0.75\,[1{,}000 + 500 s^2].$$

The incremental salvage value function is given by 

$$\Delta x_s^k = -c_s C_s + b_s u^k,$$

where 

$$
c_s = \begin{cases}
0.1 & \text{when } s = 0, 1, \\
0.2 & \text{when } s = 2,
\end{cases}
$$

and 

$$b_s = 0.5 - 0.05 s.$$

That is, the decrease in salvage value is a constant percentage of the pur- 
chase price if there is no preventive maintenance. With preventive main- 
tenance, the salvage value can be enhanced by a proportional amount. 

Let $J^*_{st}$ be the optimal value of the objective function associated with 
a machine purchased at $s$ and sold at $t \ge s + 1$. We will now solve for 
$J^*_{st}$, $s = 0, 1, 2$, and $s < t \le 3$, where $t$ is an integer. 

Before we proceed, we will as in (9.64) denote by $W_s(k,t)$ the coef- 
ficient of $u^k$ in the Hamiltonian $H$, i.e., 

$$W_s(k,t) = -(1+\rho)^{-k} + a_s \sum_{i=k+1}^{t-1} (1+\rho)^{-i} + b_s (1+\rho)^{-t}.$$

The optimal control is given by (9.65). 

It is noted in passing that 

$$W_s(k+1,t) - W_s(k,t) = (1+\rho)^{-(k+1)} (\rho - a_s),$$

so that 

$$\text{sgn}[W_s(k+1,t) - W_s(k,t)] = \text{sgn}[\rho - a_s]. \tag{9.68}$$

This implies that 

$$
W_s(k+1,t) - W_s(k,t) \;
\begin{cases}
> 0 & \text{if } (\rho - a_s) > 0, \\
= 0 & \text{if } (\rho - a_s) = 0, \\
< 0 & \text{if } (\rho - a_s) < 0.
\end{cases}
$$

In this example $\rho - a_s < 0$, which means that if there is a switching in 
the preventive maintenance trajectory of a machine, the switch must be 
from $\$100$ to $\$0$. 
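
The coefficient $W_s(k,t)$ and the bang-bang rule (9.65) are easy to evaluate numerically. The following Python sketch does so under the data of this example; the function names `W` and `bang_bang` are ours, and the sketch has been checked here only for the machine purchased at $s = 0$.

```python
RHO = 0.06      # periodic discount rate
U_MAX = 100.0   # maximum preventive maintenance per period


def a(s):
    """Return-enhancement coefficient a_s = 0.5 + 0.1 s^2."""
    return 0.5 + 0.1 * s ** 2


def b(s):
    """Salvage-enhancement coefficient b_s = 0.5 - 0.05 s."""
    return 0.5 - 0.05 * s


def W(s, k, t, rho=RHO):
    """Coefficient of u^k in H for a machine bought at s and sold at t."""
    disc = lambda i: (1.0 + rho) ** (-i)
    # Discount factors of the later operating periods i = k+1, ..., t-1.
    later_periods = sum(disc(i) for i in range(k + 1, t))
    return -disc(k) + a(s) * later_periods + b(s) * disc(t)


def bang_bang(s, t):
    """Maintenance schedule u^{k*}, k = s, ..., t-1, from the sign of W."""
    return [U_MAX if W(s, k, t) > 0 else 0.0 for k in range(s, t)]


if __name__ == "__main__":
    for t in (1, 2, 3):
        print(t, [round(W(0, k, t), 4) for k in range(t)], bang_bang(0, t))
```

For $s = 0$ and $t = 3$ the coefficient is positive only at $k = 0$, so maintenance is at its maximum in the first period and zero afterwards, in line with the switching direction implied by (9.68).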




Solution of Subproblems. We now solve the subproblems for various 
values of $s$ and $t$ ($s < t$) by using the discrete maximum principle. 

$s = 0$, $t = 1$: 

$$W_0(0, 1) = -1 + 0.5\,(1.06)^{-1} < 0.$$

From (9.65) we have 

$$u^{0*} = 0.$$

Now, 

$$
\begin{aligned}
R_0^0 &= 600, \\
R_0^1 &= 600 - 200 = 400, \\
x_0^0 &= 0.75 \times 1{,}000 = 750, \\
x_0^1 &= 750 - 0.1 \times 1{,}000 = 650, \\
J^*_{01} &= 600 - 1{,}000 + 650 \times (1.06)^{-1} = \$213.2.
\end{aligned}
$$

Similar calculations can be carried out for other subproblems. We will 
list these results. 
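
The calculation above can be mechanized by stepping the state equations (9.41) and (9.42) forward and accumulating the terms of (9.40). The Python sketch below is one way to do this; the function `simulate`, its parameter names (in particular `c_frac` for the fractional salvage loss per period), and the dictionary `MACHINE_0` are our own labels. It is checked here only against the two no-maintenance cases for the machine bought at $s = 0$.

```python
RHO = 0.06  # periodic discount rate


def simulate(s, t, u, R0, C, d, a, c_frac, b, delta=0.25, rho=RHO):
    """Present value J_st of (9.40) for a schedule u = [u^s, ..., u^{t-1}].

    R0 is R_s^s, C is the purchase cost C_s, and the increments follow
    Delta R = -d + a*u and Delta x = -c_frac*C + b*u, as in the example.
    """
    R = R0                       # net return in the current period
    x = (1.0 - delta) * C        # salvage value right after purchase
    J = -C * (1.0 + rho) ** (-s)
    for k, uk in zip(range(s, t), u):
        J += (R - uk) * (1.0 + rho) ** (-k)
        R += -d + a * uk
        x += -c_frac * C + b * uk
    return J + x * (1.0 + rho) ** (-t)


# Data of the machine purchased at s = 0.
MACHINE_0 = dict(R0=600.0, C=1000.0, d=200.0, a=0.5, c_frac=0.1, b=0.5)

if __name__ == "__main__":
    print(round(simulate(0, 1, [0.0], **MACHINE_0), 1))
    print(round(simulate(0, 2, [0.0, 0.0], **MACHINE_0), 1))
```

Combining a routine of this kind with the sign test on $W_s(k,t)$ gives the value of each subproblem for a given pair $(s, t)$.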







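$s = 0$, $t = 2$: 

$$W_0(0,2) < 0, \quad W_0(1,2) < 0,$$

$$u^{0*} = 0, \quad u^{1*} = 0,$$

$$J^*_{02} = 466.9.$$

$s = 0$, $t = 3$: 

$$W_0(0,3) > 0, \quad W_0(1,3) < 0, \quad W_0(2,3) < 0,$$

$$u^{0*} = 100, \quad u^{1*} = 0, \quad u^{2*} = 0,$$

$$J^*_{03} = 639.$$

$s = 1$, $t = 2$: 

$$W_1(1,2) < 0,$$

$$u^{1*} = 0,$$

$$J^*_{12} = 559.9.$$

$s = 1$, $t = 3$: 

$$W_1(1,3) > 0, \quad W_1(2,3) < 0,$$

$$u^{1*} = 100, \quad u^{2*} = 0,$$

$$J^*_{13} = 1024.2.$$

$s = 2$, $t = 3$: 

$$W_2(2,3) < 0,$$

$$u^{2*} = 0,$$

$$J^*_{23} = 80.$$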

Wagner-Whitin Solution of the Entire Problem. With reference 
to the dynamic programming equation in (9.66) and (9.67), we have 

$$
\begin{aligned}
g_3 &= 0, \\
g_2 &= J^*_{23} = \$80, \\
g_1 &= \max\,[J^*_{13},\ J^*_{12} + g_2] \\
    &= \max\,[1024.2,\ 559.9 + 80] \\
    &= \$1024.2, \\
g_0 &= \max\,[J^*_{03},\ J^*_{01} + g_1,\ J^*_{02} + g_2] \\
    &= \max\,[639.0,\ 213.2 + 1024.2,\ 466.9 + 80] \\
    &= \$1237.4.
\end{aligned}
$$

Now we can summarize the optimal solution. The optimal number of 
machines is 2, and their optimal policies are as follows: 

First Machine Optimal Policy: 

Purchase at $s = 0$; sell at $t = 1$; optimal preventive maintenance 
policy $u^{0*} = 0$. 

Second Machine Optimal Policy: 

Purchase at $s = 1$; sell at $t = 3$; optimal preventive maintenance 
policy $u^{1*} = 100$, $u^{2*} = 0$. The value of the objective function is 
$J^* = \$1237.4$. 
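
The backward recursion (9.66)-(9.67) can be sketched in a few lines of Python. The function name `wagner_whitin` and the dictionary layout are ours; the subproblem values $J^*_{st}$ are simply those listed above, taken as given.

```python
def wagner_whitin(J, T):
    """Backward recursion g_s = max_{t > s} [J[(s, t)] + g_t] with g_T = 0.

    J maps a (purchase time, sale time) pair to the subproblem value.
    Returns the table g and the optimal chain of (purchase, sale) pairs.
    """
    g = {T: 0.0}
    best_sale = {}
    for s in range(T - 1, -1, -1):
        t_best = max(range(s + 1, T + 1), key=lambda t: J[(s, t)] + g[t])
        g[s] = J[(s, t_best)] + g[t_best]
        best_sale[s] = t_best
    # Follow the optimal sale dates forward from period 0.
    chain, s = [], 0
    while s < T:
        chain.append((s, best_sale[s]))
        s = best_sale[s]
    return g, chain


# Subproblem values J*_{st} as listed in the text.
J_STAR = {
    (0, 1): 213.2, (0, 2): 466.9, (0, 3): 639.0,
    (1, 2): 559.9, (1, 3): 1024.2,
    (2, 3): 80.0,
}

if __name__ == "__main__":
    g, chain = wagner_whitin(J_STAR, 3)
    print(round(g[0], 1), chain)
```

With the values above, the recursion selects the chain of two machines described in the summary: the first held from period 0 to 1 and the second from period 1 to 3.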



EXERCISES FOR CHAPTER 9 



9.1 Change the values of $U$ and $d(t)$ in Example 9.1.3 to the new values 
$U = 1/2$ and $d(t) = 3$ and re-solve the problem. 

9.2 Show for the model in Section 9.1.1 that if it is optimal to have the 
maximum maintenance throughout the life of the machine, then 
its optimal life T must satisfy g(T) — 1 > 0. In particular, for the 
example in Section 9.1.3, show T < 3. 

9.3 Re-solve the example in Section 9.1.3 with $x(0) = 40$. 



9.4 Replace the maintenance effectiveness function in Example 9.1.3 
by 

$$g(t) = 2/(16 + t)^{1/2}$$

and solve the resulting problem. 



9.5 Let the maintenance effectiveness function in the model of Section 
9.1.4 be 

$$g(u,t) = \frac{2 u^{1/2}}{(1+t)^{1/2}}.$$

Derive the formula for $u^0(t)$ for this case. Furthermore, solve the 
problem with $T = 34.8$, $U = 1$, $x(0) = 100$, $d(t) = 2$, $\pi = 0.1$, 
and $\rho = 0.05$, and compare its solution to that of the numerical 
example in Section 9.1.3. Note that the sale date $T$ is assumed to 
be fixed in Section 9.1.4 for simplicity in exposition. 

9.6 Derive the formula in (9.33) by using (3.14). 

9.7* To show that the singular control in the third alternative in (9.35) 
can be sustained, we set $dH_u/dt = 0$ for all $t$ for which a singular 
control obtains. That is, 

$$C_{uu}\dot{u} = C_u[\rho + (1 - u)h] + \rho L - R + C(u)h \tag{9.69}$$

must hold on an optimal singular arc. Since $C_{uu} > 0$, 
$\text{sgn}(\dot{u}) = \text{sgn}(\text{RHS})$. Show that $u^*(t)$ is nonincreasing over time. 

9.8 For the model of Section 9.2, prove that the derived Hamiltonian 
$H$ is concave in $F$ for each given $\lambda$ and $t$, so that the Sufficiency 
Theorem 2.1 holds. 

9.9 Verify the expression in (9.68). 

9.10* Extend the formulation of the Sethi-Morton model in Section 9.3 
to allow for probabilistic technological breakthroughs. 

[Hint: see Sethi and Morton (1972) and Sethi and Thompson 
(1977)]. 

9.11* Extend the Thompson model in Section 9.1 to allow for process 
discontinuities. An example of this type of machine is an airplane 
assigned to passenger transportation which may, after some de- 
terioration or obsolescence, be assigned to freight transportation 
before its eventual retirement. Formulate and analyze the prob- 
lem. 

[Hint: see Tapiero (1971)] 

9.12* Let us define the state of a machine to be ‘0’ if it is working and 
‘1’ if it is being repaired. Let $\lambda$ be the breakdown rate and $\mu$ be 
the service rate as in waiting-line theory, so that we have 

$$\dot{P}_0 = -\lambda P_0 + \mu (1 - P_0), \quad P_0(0) = 1,$$

where $P_0(t)$ is the probability that the machine is in the state 0 
at time $t$. Let $P_1(t) = 1 - P_0(t)$, which is the probability that the 
machine is in state 1 at time $t$. This equation along with (9.3) gives 
us two state equations. In view of the equation for $P_0$, we modify 
the objective function (9.2) to 

$$J = \int_0^T [\pi x(t) P_0(t) - u(t) - k P_1(t)]\, e^{-\rho t}\, dt + x(T) e^{-\rho T},$$

where $k$ characterizes the additional expenditure rate while the 
machine is being repaired. Solve this model to obtain the optimal 
control. 

[Hint: see Alam and Sarma (1974)]. 

9.13 Re-solve Exercise 2.13 when T is unspecified and it denotes the 
sale date of the machine to be determined. 




Chapter 10 

Applications to Natural 
Resources 



The rapid increase of world population is causing a corresponding in- 
crease in the demand for consumption of natural resources. As a conse- 
quence the optimal management and utilization of natural resources is 
becoming increasingly important. There are two main kinds of natural 
resource models, those involving renewable resources such as fish, food, 
timber, etc., and those involving nonrenewable or exhaustible resources 
such as petroleum, minerals, etc. 

In Section 10.1 we deal with a fishery resource model, the sole owner 
of which is considered to be a regulatory agency. The management prob- 
lem of the agency is to control the rate of fishing over time so that an 
appropriate objective function is maximized over an infinite horizon. A 
differential game extension known as the common property fishery re- 
source model is discussed in Section 12.1.3. For other applications of 
optimal control theory to renewable resource models including those in- 
volving predator-prey relationships, see Clark (1976), Goh, Leitmann, 
and Vincent (1974), and Jørgensen and Kort (1997). 

Section 10.2 deals with an optimal forest thinning model, where thin- 
ning is the process of removing some trees from the forest to improve 
its growth rate and quality. An extension to a chain of forests model is 
presented in Section 10.2.3. 

The final model presented in Section 10.3 deals with an exhaustible 
resource such as petroleum, which must be utilized optimally over a 
given horizon under the assumption that a substitute will take the place 
of the resource when the latter’s price becomes too high. Therefore, the 
analysis of this section can also be viewed as a problem of the optimal 
phasing in of the expensive substitute. 

10.1 The Sole Owner Fishery Resource Model 

With the establishment of 200-mile territorial zones in the ocean for 
most countries having coastlines, the control of fishing in these zones 
has become highly regulated by these countries. In this sense, fishing 
in territorial waters can be considered as a sole owner fishery problem. 
On the other hand, if the citizens and commercial fishermen of a given 
country are permitted to fish freely in their territorial waters, the prob- 
lem becomes that of an open access fishery. The solutions of these two 
extreme problems are quite different, as will be shown in this section. 

10.1.1 The Dynamics of Fishery Models 

We introduce the following notation and terminology which is due to 
Clark (1976): 

$\rho$ = the discount rate, 

$x(t)$ = the biomass of fish population at time $t$, 

$g(x)$ = the natural growth function, 

$u(t)$ = the rate of fishing effort at time $t$; $0 \le u \le U$, 

$q$ = the catchability coefficient, 

$p$ = the unit price of landed fish, 

$c$ = the unit cost of effort. 

Assume that the growth function $g$ is differentiable and concave, and 
it satisfies 

$$g(0) = 0, \quad g(X) = 0, \quad g(x) > 0 \text{ for } 0 < x < X, \tag{10.1}$$

where $X$ denotes the carrying capacity, i.e., the maximum sustainable 
fish biomass. 

The model equation due to Gordon (1954) and Schaefer (1957) is 

$$\dot{x} = g(x) - qux, \quad x(0) = x_0, \tag{10.2}$$

and the instantaneous profit rate is 

$$\pi(x, u) = (pqx - c)u. \tag{10.3}$$

From (10.1) and (10.2), it follows that $x$ will stay in the closed interval 
$0 \le x \le X$ provided $x_0$ is in the same interval. 

An open access fishery is one in which exploitation is completely 
uncontrolled. Gordon (1954) analyzed this model, also known as the 
Gordon-Schaefer model, and showed that the fishing effort tends to reach 
an equilibrium, called bionomic equilibrium, at the level at which total 
revenue equals total cost. In other words, the so-called economic rent is 
completely dissipated. From (10.3) and (10.2), this level is simply 

$$x_b = \frac{c}{pq} \quad \text{and} \quad u_b = \frac{g(x_b)\,p}{c}. \tag{10.4}$$

We assume $u_b \le U$. The economic basis for this result is as follows: 
If the fishing effort $u > u_b$ is made, then total costs exceed total revenues 
so that at least some fishermen will lose money, and eventually some will 
drop out, thus reducing the level of fishing effort. On the other hand, 
if fishing effort $u < u_b$ is made, then total revenues exceed total costs, 
thereby attracting additional fishermen, and increasing the fishing effort. 

The Gordon-Schaefer model is a static model which does not (in 
general) maximize the present value of the total profits which can be 
obtained from the fish resources. 
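
To make the bionomic equilibrium (10.4) concrete, the following Python sketch evaluates it for a logistic growth function $g(x) = rx(1 - x/X)$. The logistic form and all parameter values are illustrative assumptions of ours, not data from the text; the logistic function is one choice satisfying (10.1).

```python
def logistic(x, r=1.0, X=100.0):
    """Assumed growth function g(x) = r x (1 - x/X); satisfies (10.1)."""
    return r * x * (1.0 - x / X)


def bionomic_equilibrium(p, q, c, g=logistic):
    """Stock x_b and effort u_b at which economic rent is dissipated."""
    x_b = c / (p * q)          # zero profit: p q x - c = 0
    u_b = g(x_b) * p / c       # steady state: g(x_b) = q u_b x_b
    return x_b, u_b


if __name__ == "__main__":
    p, q, c = 2.0, 0.01, 0.5   # hypothetical price, catchability, unit cost
    x_b, u_b = bionomic_equilibrium(p, q, c)
    profit = (p * q * x_b - c) * u_b       # profit rate (10.3)
    growth = logistic(x_b) - q * u_b * x_b  # right-hand side of (10.2)
    print(x_b, u_b, profit, growth)
```

At the computed pair both the profit rate (10.3) and the right-hand side of (10.2) vanish, which is what characterizes the equilibrium.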



10.1.2 The Sole Owner Model 

The bionomic equilibrium solution obtained from the open access fishery 
model usually implies severe biological overfishing. Suppose a fishing 
regulatory agency is established to improve the operation of the fishing 
industry. In determining the objective of the agency, it is convenient to 
think of it as a sole owner who has complete rights to exploit the fishing 
resource. It is reasonable to assume that the agency attempts to 

$$\text{maximize} \left\{ J = \int_0^\infty e^{-\rho t} (pqx - c)\, u \, dt \right\} \tag{10.5}$$

subject to (10.2). This is the optimal control problem to be solved. 

10.1.3 Solution by Green’s Theorem 

The solution method presented in this section generalizes the one based 
on Green’s theorem used in Section 7.2.2. Solving (10.2) for $u$ we obtain 

$$u = \frac{g(x) - \dot{x}}{qx}, \tag{10.6}$$

which we substitute into the objective (10.5), giving 

$$J = \int_0^\infty e^{-\rho t} (pqx - c)\, \frac{g(x) - \dot{x}}{qx} \, dt. \tag{10.7}$$

Rewriting, we have 

$$J = \int_0^\infty e^{-\rho t} [M(x) + N(x)\dot{x}] \, dt, \tag{10.8}$$

where 

$$N(x) = -p + \frac{c}{qx} \quad \text{and} \quad M(x) = \left(p - \frac{c}{qx}\right) g(x). \tag{10.9}$$

We note that we can write $\dot{x}\,dt = dx$ so that (10.8) becomes the following 
line integral 

$$J_B = \int_B [e^{-\rho t} M(x)\,dt + e^{-\rho t} N(x)\,dx], \tag{10.10}$$

where $B$ is a state trajectory in $(x,t)$ space, $t \in [0, \infty)$. 

In this section we are only interested in the infinite horizon solution. 
The Green’s theorem method achieves such a solution by first solving a 
finite horizon problem as in Section 7.2.2, and then determining the infi- 
nite horizon solution for which you are asked to verify that the maximum 
principle holds in Exercise 10.1. See also Sethi (1977b). 

In order to apply Green’s Theorem to (10.10), let $\Gamma$ denote a simple 
closed curve in the $(x, t)$ space surrounding a region $R$ in the space. 
Then, 

$$J_\Gamma = \oint_\Gamma [e^{-\rho t} M(x)\,dt + e^{-\rho t} N(x)\,dx] = \iint_R -e^{-\rho t} [\rho N(x) + M'(x)] \, dt \, dx. \tag{10.11}$$

If we let 

$$I(x) = -[\rho N(x) + M'(x)],$$

we can rewrite (10.11) as 

$$J_\Gamma = \iint_R e^{-\rho t} I(x) \, dt \, dx.$$

We can now conclude, as we did in Sections 7.2.2 and 7.2.4, that the 
turnpike level $\bar{x}$ is given by setting the integrand of (10.11) to 0. That 
is, 

$$-I(x) = [g'(x) - \rho]\left(p - \frac{c}{qx}\right) + \frac{c\,g(x)}{q x^2} = 0. \tag{10.12}$$

In addition, a second-order condition must be satisfied for the solution $\bar{x}$ 
of (10.12) to be a turnpike solution; see Lemma 7.1 and the subsequent 
discussion there. The required second-order condition can be stated as: 

$$I(x) < 0 \text{ for } x < \bar{x} \quad \text{and} \quad I(x) > 0 \text{ for } x > \bar{x}.$$

Let $\bar{x}$ be the unique solution to (10.12) satisfying the second-order condi- 
tion. The procedure can be extended to the case of nonunique solutions 
as in Sethi (1977b). 
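
For a specific growth function, (10.12) can be solved numerically. The Python sketch below does this by bisection for a logistic growth function $g(x) = rx(1 - x/X)$; the logistic form, the parameter values, and the function names are illustrative assumptions of ours. The bisection relies on the left-hand side of (10.12) being positive at $x_b = c/pq$ and negative at $X$, which holds whenever $x_b < X$, and it presumes a single sign change in between.

```python
def minus_I(x, rho, p, q, c, r, X):
    """Left-hand side of (10.12) for logistic growth g(x) = r x (1 - x/X)."""
    g = r * x * (1.0 - x / X)
    g_prime = r * (1.0 - 2.0 * x / X)
    return (g_prime - rho) * (p - c / (q * x)) + c * g / (q * x * x)


def turnpike(rho, p, q, c, r, X, tol=1e-10):
    """Bisection for the root of (10.12) between x_b = c/(pq) and X."""
    lo, hi = c / (p * q), X
    if not lo < hi:
        raise ValueError("need x_b < X")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if minus_I(mid, rho, p, q, c, r, X) > 0.0:
            lo = mid   # still to the left of the turnpike level
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    par = dict(p=2.0, q=0.01, c=0.5, r=1.0, X=100.0)  # hypothetical data
    x_bar = turnpike(0.1, **par)
    g_bar = par["r"] * x_bar * (1.0 - x_bar / par["X"])
    u_bar = g_bar / (par["q"] * x_bar)  # effort that holds the stock at x_bar
    print(x_bar, u_bar)
```

With these hypothetical data the computed level lies strictly between $x_b$ and $X$, and it falls as the discount rate rises.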

The corresponding value $\bar{u}$ of the control which would maintain the 
fish stock level at $\bar{x}$ is $g(\bar{x})/(q\bar{x})$. In Exercise 10.2 you are asked to show 
that $\bar{x} \in (x_b, X)$ and also that $\bar{u} < U$. In Figure 10.1 optimal trajectories 
are shown for two different initial values: $x_0 < \bar{x}$ and $x_0 > \bar{x}$. 

Figure 10.1: Optimal Policy for the Sole Owner Fishery Model 

Equation (10.12) has an important economic interpretation. To see 
it, we rewrite (10.12) as 

$$\frac{d\pi(\bar{x})}{dx} = \rho \left( \frac{pq\bar{x} - c}{q\bar{x}} \right), \tag{10.13}$$

where 

$$\pi(x) = \frac{g(x)(pqx - c)}{qx}.$$

The interpretation of $\pi(x)$ is that it is the sustainable economic rent at 
fish stock level $x$. This can be seen by substituting into (10.3) the effort 
$u = g(x)/(qx)$, which by (10.2) is the fishing effort required to maintain 
the fish stock at level $x$. Suppose we have attained the equilibrium level 
$\bar{x}$ given by (10.12), and suppose we reduce this level to $\bar{x} - \Delta$ by using 
fishing effort of $\Delta/(q\bar{x})$. The immediate marginal revenue, MR, from this 
action is 

$$MR = (pq\bar{x} - c)\, \frac{\Delta}{q\bar{x}}.$$

However, this causes a decrease in the sustainable economic rent which 
equals 

$$\pi'(\bar{x})\Delta.$$

Over the infinite future, the present value of this stream, i.e., the 
marginal cost MC, is 

$$MC = \int_0^\infty e^{-\rho t} \pi'(\bar{x}) \Delta \, dt = \frac{\pi'(\bar{x})\Delta}{\rho}.$$

Equating MR and MC, we obtain (10.13), which is also (10.12). 

When the discount rate is zero, equation (10.13) reduces to 

$$\pi'(\bar{x}) = 0,$$

so that it will give the equilibrium fish stock level $\bar{x}|_{\rho=0}$, which 
maximizes the instantaneous profit rate $\pi(x)$. This is called in economics 
the golden rule level. 

When $\rho = \infty$, we can assume that $\pi'(x)$ is bounded. From (10.13) 
we have $pq\bar{x} - c = 0$, which gives 

$$\bar{x}|_{\rho=\infty} = x_b = c/pq.$$

The latter is the bionomic equilibrium attained in the open access fishery 
solution; see (10.4). 

The sole owner solution $\bar{x}$ satisfies $\bar{x} > x_b = c/pq$. If we regard a 
government regulatory agency as the sole owner responsible for operat- 
ing the fishery at level $\bar{x}$, then it can impose restrictions, such as gear 
regulations, catch limitations, etc., which increase the fishing cost $c$. If 
$c$ is increased to the level $pq\bar{x}$, then the fishery can be turned into an 
open access fishery subject to those regulations, and it will attain the 
bionomic equilibrium at level $\bar{x}$. 



10.2 An Optimal Forest Thinning Model 

Forests are another important kind of renewable natural resources, and 
their optimal management is becoming a significant current problem. In 
Kilkki and Vaisanen (1969), a model is developed for forest growth and 
thinning in connection with Scotch Pine forests in Finland. Thinning is 
the process of removing some but not all trees prior to the clear cutting of 
the forest. Besides yielding a harvest of wood, the thinning process also 
improves the growth rate and quality of the forest. The solution method 
employed by Kilkki and Vaisanen was based on dynamic programming. 
We shall use the maximum principle approach to solve the model; see 
Clark (1976). 

10.2.1 The Forestry Model 

We introduce the following notation: 

$t_0$ = the initial age of the forest, 

$\rho$ = the discount rate, 

$x(t)$ = the volume of usable timber in the forest at time $t$, 

$u(t)$ = the rate of thinning at time $t$, 

$p$ = the constant price per unit volume of timber, 

$c$ = the constant cost per unit volume of thinning, 

$f(x)$ = the growth function, which is positive, concave, and 
has a unique maximum at $x_m$; we assume $f(0) = 0$, 

$g(t)$ = the growth coefficient, which is a positive, decreasing 
function of time. 

The specific functional form for the forest growth used in Kilkki and 
Vaisanen (1969) is as follows: 

$$f(x) = x e^{-\alpha x}, \quad 0 \le x \le \frac{1}{\alpha},$$

where $\alpha$ is a positive constant. Note that $f$ is concave in the relevant 
range and that $x_m = 1/\alpha$. They use the growth coefficient of the form 

$$g(t) = a t^{-b},$$

where $a$ and $b$ are positive constants. 

The forest growth equation is 

$$\dot{x} = g(t) f(x) - u(t), \quad x(t_0) = x_0. \tag{10.14}$$

The objective function is to 

$$\text{maximize} \left\{ J = \int_{t_0}^\infty e^{-\rho t} (p - c)\, u \, dt \right\} \tag{10.15}$$

subject to (10.14) and the state and control constraints 

$$x(t) \ge 0 \quad \text{and} \quad u(t) \ge 0. \tag{10.16}$$

The control constraint in (10.16) implies that there is no replanting in 
the forest. In Section 10.2.3 we extend this model to incorporate the 
successive replantings of the forest each time it is clear cut. 

10.2.2 Determination of Optimal Thinning 

We solve the forest thinning model by using the maximum principle. 
The Hamiltonian is 

$$H = (p - c)u + \lambda [g f(x) - u] \tag{10.17}$$

with the adjoint equation 

$$\dot{\lambda} = \lambda [\rho - g f'(x)]. \tag{10.18}$$

The optimal control is 

$$u^* = \text{bang}[0, \infty; p - c - \lambda]. \tag{10.19}$$

The appearance of $\infty$ as an upper bound in (10.19) simply means that 
impulse control is permitted. 

We do not use the Lagrangian form of the maximum principle to 
include constraints (10.16) because, as we shall see, the forestry problem 
has a natural ending at a time $\hat{T}$ for which $x(\hat{T}) = 0$. 

To get the singular control solution triple $\{\bar{x}, \bar{\lambda}, \bar{u}\}$, we must observe 
that $\bar{x}$ and $\bar{u}$ will be functions of time. From (10.19), we have 

$$\bar{\lambda} = p - c, \tag{10.20}$$

which is a constant so that $\dot{\bar{\lambda}} = 0$. From (10.18), 

$$f'(\bar{x}(t)) = \frac{\rho}{g(t)} \quad \text{or} \quad \bar{x}(t) = f'^{-1}(\rho/g(t)). \tag{10.21}$$

Then, from (10.14), 

$$\bar{u}(t) = g(t) f(\bar{x}(t)) - \dot{\bar{x}}(t) \tag{10.22}$$

gives the singular control. 
Figure 10.2: Singular Usable Timber Volume $\bar{x}(t)$ 

The solution of (10.21) can be illustrated as in Figure 10.2. Since 
$g(t)$ is a decreasing function of time, it is clear from Figure 10.2 that 
$\bar{x}(t)$ is a decreasing function of time, and then by (10.22), $\bar{u}(t) \ge 0$. It is 
also clear that $\bar{x}(\hat{T}) = 0$ at time $\hat{T}$, given by 

$$\frac{\rho}{g(\hat{T})} = f'(0),$$

which, in view of $f'(0) = 1$, gives 

$$\hat{T} = e^{-(1/b)\ln(\rho/a)}. \tag{10.23}$$
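
The singular stock level of (10.21) and the ending time of (10.23) can be computed numerically. The Python sketch below assumes the functional forms $f(x) = xe^{-\alpha x}$ and $g(t) = at^{-b}$; the parameter values and function names are hypothetical and chosen only for illustration. On $[0, 1/\alpha]$ the derivative $f'(x) = (1 - \alpha x)e^{-\alpha x}$ decreases from 1 to 0, so a bisection suffices.

```python
import math


def f_prime(x, alpha):
    """Derivative of the assumed growth function f(x) = x exp(-alpha x)."""
    return (1.0 - alpha * x) * math.exp(-alpha * x)


def g(t, a, b):
    """Assumed growth coefficient g(t) = a t^(-b), for t > 0."""
    return a * t ** (-b)


def T_hat(rho, a, b):
    """Ending time from (10.23), i.e. the solution of g(T) f'(0) = rho."""
    return math.exp(-(1.0 / b) * math.log(rho / a))


def x_bar(t, rho, a, b, alpha, tol=1e-12):
    """Singular timber volume solving f'(x) = rho / g(t), as in (10.21)."""
    target = rho / g(t, a, b)
    if target >= 1.0:
        return 0.0           # at or beyond T_hat the singular level is zero
    lo, hi = 0.0, 1.0 / alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_prime(mid, alpha) > target:
            lo = mid         # f' still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rho, a, b, alpha = 0.05, 1.0, 0.5, 0.01   # hypothetical data
    print(T_hat(rho, a, b))
    for t in (100.0, 200.0, 399.0):
        print(t, x_bar(t, rho, a, b, alpha))
```

With these hypothetical values the singular volume declines with the age of the forest and reaches zero at $\hat{T}$.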

In Figure 10.3 we plot $\bar{x}(t)$ as a function of time $t$. The figure also 
contains an optimal control trajectory for the case in which $x_0 < \bar{x}(t_0)$. 
To determine the switching time $\hat{t}$, we first solve (10.14) with $u = 0$. Let 
$x(t)$ be the solution. Then, $\hat{t}$ is the time at which the $x(t)$ trajectory 
intersects the $\bar{x}(t)$ curve; see Figure 10.3. 

Figure 10.3: Optimal Policy for the Forest Thinning Model when $x_0 < \bar{x}(t_0)$ 

For $x_0 > \bar{x}(t_0)$, the optimal control at $t_0$ will be the impulse cutting 
to bring the level from $x_0$ to $\bar{x}(t_0)$ instantaneously. To complete the 
infinite horizon solution, set $u^*(t) = 0$ for $t \ge \hat{T}$. In Exercise 10.10 you 
are asked to obtain $\lambda(t)$ for $t \in [0, \infty)$. 

10.2.3 A Chain of Forests Model 

We now extend the model of Section 10.2.1 to incorporate successive 
replantings of the forest each time it is clearcut. This extension is sim- 
ilar to the chain of machines model of Section 9.3. We shall assume 
that successive plantings, sometimes called forest rotations, take place 
at equal intervals. This is similar to the assumption (9.39) employed in 
the machine replacement problem treated in Sethi (1973b). 
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Let T be the rotation period, i.e., the time from planting to clear- 
cutting which is to be determined. During the nth rotation, the dynamics 
of the forest is given by (10.13) with t G [(n— 1)T, nT] and x[(n—l)T] = 0. 
The objective function to be maximized is given by 

oo 

J(T) = ^ / e-f\p - c)udt 

k=l 

= 1 _ e-pr Jo (10.24) 

From the solution of the model of the previous section, and the
assumption that the forest is profitable, it is obvious that $0 < T \leq \hat{T}$
as shown in Figure 10.4. We have two cases to consider, depending on
whether $T > \hat{t}$ or $T \leq \hat{t}$.

Figure 10.4: Optimal Policy for the Chain of Forests Model when $T > \hat{t}$

Case 1 $T > \hat{t}$.

From the preceding section it is easy to conclude that the optimal
trajectory is as shown in Figure 10.4. Using the turnpike terminology of
Chapter 7, the trajectory from 0 to A is the entry ramp to the turnpike,
the trajectory from A to B is on the turnpike, and the trajectory from
B to T is the exit ramp. With this solution we can write the $J^*(T)$ of
(10.24) for a given $T$ as

$$\begin{aligned}
J^*(T) &= \frac{1}{1 - e^{-\rho T}} \left[ \int_0^{\hat{t}} e^{-\rho t}(p - c)u^*\,dt + \int_{\hat{t}}^{T} e^{-\rho t}(p - c)u^*(t)\,dt \right] \\
&= \frac{1}{1 - e^{-\rho T}} \left[ \int_{\hat{t}}^{T} e^{-\rho t}(p - c)\bar{u}(t)\,dt + e^{-\rho T}(p - c)\bar{x}(T) \right].
\end{aligned} \tag{10.25}$$

Note that $u^*(T) = \mathrm{imp}[\bar{x}(T), 0; T]$ in the second integral is an impulse
control bringing the forest from value $\bar{x}(T)$ to 0 by a clearcutting operation;
see Exercise 10.11. To find the optimal value of $T$ for this case we
differentiate (10.25) with respect to $T$, equate the result to zero, and simplify,
obtaining (see Exercise 10.12)

$$(1 - e^{-\rho T})\,g(T) f[\bar{x}(T)] - \rho\bar{x}(T) - \rho \int_{\hat{t}}^{T} e^{-\rho t}\bar{u}(t)\,dt = 0. \tag{10.26}$$

If the solution $T$ lies in $(\hat{t}, \hat{T}]$, keep it; otherwise set $T = \hat{T}$. Note that
(10.26) can also be derived by using the transversality condition (3.14);
see Exercise 3.5.

Case 2 $T \leq \hat{t}$.

The optimal trajectory in this case is as shown in Figure 10.5. In the
Vidale-Wolfe advertising model of Chapter 7, a similar case occurs when
$T$ is small; see Figure 7.10 and compare it with Figure 10.5. The solution
for $x(T)$ is obtained by integrating (10.13) with $u = 0$ and $x_0 = 0$. Let
this solution be denoted as $x^*(t)$. Here (10.24) becomes

$$J^*(T) = \frac{e^{-\rho T}}{1 - e^{-\rho T}}(p - c)x^*(T). \tag{10.27}$$

To find the optimal value of $T$ for this case, we differentiate (10.27) and
equate to zero. We obtain

$$(e^{\rho T} - 1)\,g(T) f[x^*(T)] - \rho e^{\rho T} x^*(T) = 0. \tag{10.28}$$

If the solution lies in the interval $[0, \hat{t}]$, keep it; otherwise set $T = \hat{t}$.

The optimal value $T^*$ can be obtained by computing $J^*(T)$ from both
cases and selecting whichever is larger; see also Naslund (1969) and Sethi
(1973c).

Figure 10.5: Optimal Policy for the Chain of Forests Model when $T \leq \hat{t}$

10.3 An Exhaustible Resource Model 

In the previous two sections we discussed two renewable resource
models. However, many natural resources are nonrenewable or exhaustible.
Examples are petroleum, mineral deposits, and coal. Given the growing
energy shortage, the optimal production and use of these resources
is of immense importance to the world. The earliest important work
in this area is due to Hotelling (1931). Since then a number of studies
have been published, such as Dasgupta and Heal (1974), Solow (1974),
Weinstein and Zeckhauser (1975), Manh-Hung (1974), Pindyck (1978a,
1978b), Derzko and Sethi (1981a, 1981b), and Amit (1986). See also
other papers in The Review of Economic Studies, Vol. 41, Symposium
on the Economics of Exhaustible Resources (1974).

In this section we discuss a simple model taken from Sethi (1979a). 
This paper analyzes optimal depletion rates by maximizing a social wel- 
fare function which involves consumers’ surplus and producers’ surplus 
with various weights. Here we select a model having the equally weighted 
criterion function. 

10.3.1 Formulation of the Model 

The model will be developed under the assumption that at a high enough
price, say $\bar{p}$, a substitute, preferably renewable, will become available.
For example, if the price of fossil fuel becomes sufficiently high, solar
energy may become an economic substitute. In the North American
context, the resource under consideration could be crude oil and its
expensive substitute could be coal and/or tar sands; see, e.g., Fuller and
Vickson (1987).

We introduce the following notation: 

$p(t)$ = the price of the resource at time $t$,

$q = f(p)$ is the demand function, i.e., the quantity demanded at
price $p$; $f' \leq 0$, $f(p) > 0$ for $p < \bar{p}$, and $f(p) = 0$ for
$p \geq \bar{p}$, where $\bar{p}$ is the price at which the substitute
completely replaces the resource. A typical graph of the demand
function is shown in Figure 10.6,

$c = G(q)$ is the cost function; $G(0) = 0$, $G(q) > 0$ for $q > 0$,
$G' > 0$ and $G'' \geq 0$ for $q \geq 0$, and $G'(0) < \bar{p}$. The latter
assumption makes it possible for the producers to make positive
profit at a price $p$ below $\bar{p}$,

$Q(t)$ = the available stock or reserve of the resource at time $t$,
$Q(0) = Q_0 > 0$,

$\rho$ = the social discount rate, $\rho > 0$,

$T$ = the horizon time, which is the latest time at which the
substitute will become available regardless of the price of the
natural resource, $T > 0$.

Before stating the optimal control problem, we need the following
additional definitions and assumptions. Let

$$c = G[f(p)] = g(p), \tag{10.29}$$

for which it is obvious that $g(p) > 0$ for $p < \bar{p}$ and $g(p) = 0$ for $p \geq \bar{p}$.
Let

$$\pi(p) = p f(p) - g(p) \tag{10.30}$$

denote the profit function of the producers, i.e., the producers' surplus.
Let $\underline{p}$ be the smallest price at which $\pi(p)$ is nonnegative. Assume further
that $\pi(p)$ is a concave function in the range $[\underline{p}, \bar{p}]$ as shown in Figure 10.7.
In the figure the point $p^m$ indicates the price which maximizes $\pi(p)$.

We also define

$$\psi(p) = \int_p^{\bar{p}} f(y)\,dy \tag{10.31}$$

as the consumers' surplus, i.e., the area shown shaded in Figure 10.6.
This quantity represents the total excess amount consumers would be
willing to pay. In other words, consumers pay $p f(p)$, while they would
be willing to pay

$$\int_{\bar{p}}^{p} y f'(y)\,dy = p f(p) + \psi(p).$$

Figure 10.6: The Demand Function

The instantaneous rate of consumers' surplus and producers' surplus is
the sum $\psi(p) + \pi(p)$. Let $\tilde{p}$ denote the price that maximizes this sum,
i.e., $\tilde{p}$ solves

$$\psi'(\tilde{p}) + \pi'(\tilde{p}) = \tilde{p} f'(\tilde{p}) - g'(\tilde{p}) = 0. \tag{10.32}$$

In Exercise 10.14 you will be asked to show that $\tilde{p} < p^m$, as marked in
Figure 10.7. Later we will show that the correct second-order conditions
hold at $\tilde{p}$.

The optimal control problem is:

$$\max \left\{ J = \int_0^T [\psi(p) + \pi(p)] e^{-\rho t}\,dt \right\} \tag{10.33}$$

subject to

$$\dot{Q} = -f(p), \quad Q(0) = Q_0, \tag{10.34}$$

$$Q(T) \geq 0, \tag{10.35}$$

and $p \in \Omega = [\underline{p}, \bar{p}]$. Recall that the sum $\psi(p) + \pi(p)$ is concave in $p$.

Figure 10.7: The Profit Function

10.3.2 Solution by the Maximum Principle 

Form the current-value Hamiltonian

$$H(Q, p, \lambda) = \psi(p) + \pi(p) + \lambda[-f(p)], \tag{10.36}$$

where $\lambda$ satisfies the relation

$$\dot{\lambda} = \rho\lambda, \quad \lambda(T) \geq 0, \quad \lambda(T)Q(T) = 0, \tag{10.37}$$

which implies

$$\lambda(t) = \begin{cases} 0 & \text{if } Q(T) \geq 0 \text{ is not binding}, \\ \lambda(T)e^{\rho(t-T)} & \text{if } Q(T) \geq 0 \text{ is binding}. \end{cases} \tag{10.38}$$

To obtain the optimal control, the Hamiltonian maximizing condition,
which is both necessary and sufficient in this case (see Theorem 2.1), is

$$\frac{\partial H}{\partial p} = \psi' + \pi' - \lambda f' = (p - \lambda)f' - g' = 0. \tag{10.39}$$

To show that the solution $s(\lambda)$ for $p$ of (10.39) actually maximizes the
Hamiltonian, it is enough to show that the second derivative of the
Hamiltonian is negative at $s(\lambda)$. Differentiating (10.39) gives

$$\frac{\partial^2 H}{\partial p^2} = f' - g'' + (p - \lambda)f''.$$

Using (10.39) we have

$$\frac{\partial^2 H}{\partial p^2} = f' - g'' + \frac{g'}{f'}f''. \tag{10.40}$$

From the definition of $G$ in (10.29), we have $g' = G'f'$ and
$g'' = G''f'^2 + G'f''$, from which we can obtain

$$G'' = \frac{f'g'' - g'f''}{f'^3},$$

which, when substituted into (10.40), gives

$$\frac{\partial^2 H}{\partial p^2} = f' - G''f'^2. \tag{10.41}$$

The right-hand side of (10.41) is strictly negative because $f' < 0$ and
$G'' \geq 0$ by assumption. We remark that $\tilde{p} = s(0)$ using (10.32) and
(10.39), and hence the second-order condition for $\tilde{p}$ of (10.32) to give
the maximum of $H$ is verified. In Exercise 10.15 you are asked to show
that $s(\lambda)$ increases from $\tilde{p}$ as $\lambda$ increases from 0, and that $s(\lambda) = \bar{p}$
when $\lambda = \bar{p} - G'(0)$.



Case 1 The constraint $Q(T) \geq 0$ is not binding.

From (10.38), $\lambda(t) = 0$ so that from (10.39) and (10.32),

$$p^* = \tilde{p}. \tag{10.42}$$

With this value, the total consumption of the resource is $Tf(\tilde{p})$, which
must be $\leq Q_0$ so that the constraint $Q(T) \geq 0$ is not binding. Hence,

$$Tf(\tilde{p}) \leq Q_0 \tag{10.43}$$

characterizes Case 1 and its solution is given in (10.42).



Case 2 $Tf(\tilde{p}) > Q_0$ so that the constraint $Q(T) \geq 0$ is binding.

To obtain the solution requires finding a value of $\lambda^*(T)$ such that

$$\int_0^{t^*} f(s[\lambda^*(T)e^{\rho(t-T)}])\,dt = Q_0, \tag{10.44}$$

where

$$t^* = \min\left[T,\ T + \frac{1}{\rho}\ln\frac{\bar{p} - G'(0)}{\lambda^*(T)}\right]. \tag{10.45}$$

The time $t^*$, if it is less than $T$, is the time at which $s[\lambda^*(T)e^{\rho(t^*-T)}] = \bar{p}$.
From Exercise 10.15,

$$\lambda^*(T)e^{\rho(t^*-T)} = \bar{p} - G'(0), \tag{10.46}$$

which, when solved for $t^*$, gives the second argument of (10.45).

One method to obtain the optimal solution is to define $\bar{T}$ as the
longest time horizon during which the resource can be optimally used.
Such a $\bar{T}$ must satisfy

$$\lambda^*(\bar{T}) = \bar{p} - G'(0),$$

and therefore,

$$\int_0^{\bar{T}} f(s[\{\bar{p} - G'(0)\}e^{\rho(t-\bar{T})}])\,dt = Q_0, \tag{10.47}$$

which is a transcendental equation for $\bar{T}$. We now have two subcases.

Subcase 2a $T \geq \bar{T}$. The optimal control is

$$p^*(t) = \begin{cases} s(\{\bar{p} - G'(0)\}e^{\rho(t-\bar{T})}) & \text{for } t \leq \bar{T}, \\ \bar{p} & \text{for } t > \bar{T}. \end{cases} \tag{10.48}$$

Clearly in this subcase, $t^* = \bar{T}$ and

$$\lambda^*(T) = [\bar{p} - G'(0)]e^{-\rho(\bar{T}-T)}.$$

A sketch of (10.48) is shown in Figure 10.8.

Subcase 2b $T < \bar{T}$. Here the optimal price trajectory is

$$p^*(t) = s[\lambda^*(T)e^{\rho(t-T)}], \tag{10.49}$$

where $\lambda^*(T)$ is to be obtained from the transcendental equation

$$\int_0^T f(s[\lambda^*(T)e^{\rho(t-T)}])\,dt = Q_0. \tag{10.50}$$

A sketch of (10.49) is shown in Figure 10.9.

In Exercise 10.16 you are given specific functions for the exhaustible
resource model and asked to work out explicit optimal price trajectories
for the model.
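The case analysis above can be made concrete numerically. The following is a minimal sketch, not a definitive implementation, assuming the linear demand $f(p) = \bar{p} - p$ and quadratic cost $G(q) = q^2$ of Exercise 10.16, for which (10.39) gives $s(\lambda) = (2\bar{p} + \lambda)/3$ and $G'(0) = 0$. The function names, the bisection bracket, and the midpoint-rule check are illustrative choices, not part of the text.

```python
import math


def s(lam, p_bar):
    """Price solving (10.39) for f(p) = p_bar - p, G(q) = q**2, capped at p_bar."""
    return min(p_bar, (2.0 * p_bar + lam) / 3.0)


def longest_horizon(p_bar, Q0, rho, iters=100):
    """Bisection for T_bar in T + exp(-rho*T)/rho = 1/rho + 3*Q0/p_bar."""
    target = 1.0 / rho + 3.0 * Q0 / p_bar
    lo, hi = 0.0, target  # left side is below target at lo, above it at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + math.exp(-rho * mid) / rho < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def optimal_price(t, T, p_bar, Q0, rho):
    """Optimal price p*(t) for horizon T under the assumed f and G."""
    if T * p_bar / 3.0 <= Q0:  # Case 1: T*f(p_tilde) <= Q0, stock not binding
        return 2.0 * p_bar / 3.0
    T_bar = longest_horizon(p_bar, Q0, rho)
    if T >= T_bar:  # Subcase 2a: resource exhausted at T_bar
        if t > T_bar:
            return p_bar
        return s(p_bar * math.exp(rho * (t - T_bar)), p_bar)
    # Subcase 2b: closed-form lambda*(T) from (10.50) for these f and G
    lam_T = rho * (p_bar * T - 3.0 * Q0) / (1.0 - math.exp(-rho * T))
    return s(lam_T * math.exp(rho * (t - T)), p_bar)


def total_consumption(T, p_bar, Q0, rho, n=2000):
    """Midpoint-rule integral of demand f(p*(t)) over [0, T]."""
    dt = T / n
    return dt * sum(
        max(p_bar - optimal_price((i + 0.5) * dt, T, p_bar, Q0, rho), 0.0)
        for i in range(n)
    )
```

In both binding subcases `total_consumption` should return approximately `Q0`, which is a quick way to check a computed price path against the stock constraint.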




Figure 10.8: Optimal Price Trajectory for $T \geq \bar{T}$

Figure 10.9: Optimal Price Trajectory for $T < \bar{T}$

EXERCISES FOR CHAPTER 10

10.1 As an alternate derivation for the turnpike level $\bar{x}$ of (10.12), use
the maximum principle to obtain the optimal long-run stationary
equilibrium triple $\{\bar{x}, \bar{u}, \bar{\lambda}\}$.

10.2 Prove that $\bar{x} \in (x_b, X)$ and $\bar{u} < U$, where $\bar{x}$ is the solution of
(10.12) and $x_b$ is given in (10.4).

10.3 Obtain the turnpike level $\bar{x}$ of (10.12) for the special case
$g(x) = x(1 - x)$, $p = 2$, $c = q = 1$, and $\rho = 0.1$.

10.4 (a) For the Schaefer model with $g(x) = rx(1 - x/X)$ and $q = 1$,
derive the formula for the turnpike level $\bar{x}$ of (10.12).

(b) Allen (1973) and Clark (1976) estimated the parameters of
the Schaefer model for the antarctic fin-whale population as
follows: $r = 0.08$, $X = 400{,}000$ whales, and $x_b = 40{,}000$.
Solve for $\bar{x}$ for $\rho = 0$, 0.10, and $\infty$.

10.5 Let $\pi(x, u) = [p - c(x)](qux)$ in (10.3), where $c(x)$ is a
differentiable, decreasing, and convex function. Derive an expression for
$\bar{x}$ satisfying (10.12).

10.6* Show that extinction is optimal if $\infty > p \geq c(0)$ and $\rho > 2g'(0)$ in
Exercise 10.5.

[Hint: Use the generalized mean value theorem.]

10.7 Let the constant price $p$ in Exercise 10.5 be replaced by a
time-dependent price $p(t)$ which is differentiable with respect to $t$. Derive
the equation for $\bar{x}$ corresponding to (10.12) for this nonautonomous
problem. Furthermore, find the turnpike level $\bar{x}(t)$ satisfying the
derived equation.

10.8 When $c(x) = 0$ in Exercise 10.7, show that the analogue to (10.12)
reduces to

$$g'(\bar{x}) = \rho - \frac{\dot{p}}{p}.$$

Give an economic interpretation of this equation.

10.9 Let $\pi(x, u)$ of Exercise 10.5 be

$$\pi(x, u) = [p - c(x)](qux) + V(x),$$

where $V(x)$ with $V'(x) > 0$ is the conservation value function,
which measures the value to society of having a large fish stock.
By deriving the analogue to (10.12), show that the new $\bar{x}$ is larger
than the $\bar{x}$ in Exercise 10.5.

10.10 Find $\lambda(t)$, $t \in [0, \infty)$, for the infinite horizon model of Section
10.2.2.

10.11 Derive (10.25) by computing the integral from $\hat{t}$ to $T$.

10.12 Derive (10.26) by using the first-order condition for maximizing
$J^*(T)$ of (10.25) with respect to $T$.

10.13 Forest Fertilization Model of Naslund (1969). Consider a forestry
model in which thinning is not allowed, and the forest is to be
clearcut at a fixed time $T$. Suppose $v(t)$ is the rate of fertilization at
time $t$, so that the growth equation is

$$\dot{x} = r(X - x) + f(v, t), \quad x(0) = x_0,$$

where $x$ is the volume of timber, $r$ and $X$ are positive constants,
and $f$ is an increasing, differentiable, concave function of $v$. The
objective function is to

$$\text{maximize} \left\{ J = -c\int_0^T e^{-\rho t}v(t)\,dt + e^{-\rho T}px(T) \right\},$$

where $p$ is the price of a unit of timber and $c$ is the unit cost of
fertilization. Show that the optimal control $v^*(t)$ is given by solving
the equation

$$\frac{\partial f}{\partial v} = \frac{c}{p}e^{-(\rho + r)(t - T)}.$$



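10.14 Show that $\tilde{p}$ defined in (10.32) satisfies $\underline{p} \leq \tilde{p} < p^m$.

10.15 Show that $s(\lambda)$, the solution of (10.39), increases from $\tilde{p}$ as $\lambda$
increases from 0. Also show that $s(\lambda) = \bar{p}$ when $\lambda = \bar{p} - G'(0)$.

10.16 For the model of Section 10.3, assume

$$f(p) = \begin{cases} \bar{p} - p & \text{for } p \leq \bar{p}, \\ 0 & \text{for } p > \bar{p}, \end{cases} \qquad G(q) = q^2.$$

(a) Show that $p^* = \frac{2}{3}\bar{p}$ if $T \leq 3Q_0/\bar{p}$.

(b) Show that $\bar{T}$ satisfies $\bar{T} + e^{-\rho\bar{T}}/\rho = 1/\rho + 3Q_0/\bar{p}$. Moreover,

$$p^*(t) = \begin{cases} \bar{p}\,[e^{\rho(t-\bar{T})} + 2]/3 & \text{if } t \leq \bar{T}, \\ \bar{p} & \text{if } t > \bar{T}, \end{cases}$$

for $T \geq \bar{T}$, and

$$p^*(t) = \frac{2}{3}\bar{p} + \frac{\rho[\bar{p}T - 3Q_0]}{3e^{-\rho t}(e^{\rho T} - 1)}$$

for $T < \bar{T}$.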




Chapter 11 

Economic Applications 



Optimal control theory has been extensively applied to the solution of
economic problems since the early papers that appeared in Shell (1967)
and the works of Arrow (1968) and Shell (1969). The field is too vast
to be surveyed in detail here, however. Several books in the area are:
Arrow and Kurz (1970), Hadley and Kemp (1971), Takayama (1974),
Lesourne and Leban (1982), Seierstad and Sydsæter (1987), Feichtinger
(1988), Leonard and Long (1992), Van Hilten, Kort, and Van Loon
(1993), Kamien and Schwartz (1998), and Dockner, Jørgensen, Long,
and Sorger (2000). We content ourselves with the discussion of three
simple kinds of models.

In Section 11.1, two capital accumulation or economic growth models
are presented. In Section 11.2, we formulate and solve an epidemic
control model. Finally, in Section 11.3, we discuss a pollution control
model.

11.1 Models of Optimal Economic Growth 

In this section we develop two simple models of economic growth or 
capital accumulation. The earliest such model was developed by Ramsey 
(1928) for an economy having stationary population; see Exercise 11.8 
for one of his models. 

The first model treated in Section 11.1.1 is a finite horizon
fixed-end-point model with stationary population. The problem is that of
maximizing the present value of utility of consumption for society while
also accumulating a specified capital stock by the end of the horizon.

The second model incorporates an exogenously and exponentially
growing population in the infinite horizon setting. The method of phase
diagrams is used to analyze the model.

For related discussion and extensions of these models, see Arrow and
Kurz (1970), Burmeister and Dobell (1970), and Intriligator (1971).

11.1.1 An Optimal Capital Accumulation Model 

Consider a one-sector economy in which the stock of capital, denoted by
$K(t)$, is the only factor of production. Let $F(K)$ be the output rate of
the economy when $K$ is the capital stock. Assume $F(0) = 0$, $F(K) > 0$,
$F'(K) > 0$, and $F''(K) < 0$ for $K > 0$. The latter implies the
diminishing marginal productivity of capital. This output can either
be consumed or be reinvested for further accumulation of capital stock.
Let $C(t)$ be the amount of output allocated to consumption, and let
$I(t) = F[K(t)] - C(t)$ be the amount invested. Let $\delta$ be the constant
rate of depreciation of capital. Then, the capital stock equation is

$$\dot{K} = F(K) - C - \delta K, \quad K(0) = K_0. \tag{11.1}$$

Let $U(C)$ be society's utility of consumption, where we assume
$U'(0) = \infty$, $U'(C) > 0$, and $U''(C) < 0$ for $C \geq 0$. Let $\rho$ denote
the social discount rate and $T$ denote the finite horizon. Then, a
government which is elected for a term of $T$ years could consider the following
problem:

$$\max \left\{ J = \int_0^T e^{-\rho t}U[C(t)]\,dt \right\} \tag{11.2}$$

subject to (11.1) and the fixed-end-point condition

$$K(T) = K_T, \tag{11.3}$$

where $K_T$ is a given positive constant. It may be noted that replacing
(11.3) by $K(T) \geq K_T$ would give the same solution.

11.1.2 Solution by the Maximum Principle 

Form the current-value Hamiltonian as

$$H = U(C) + \lambda[F(K) - C - \delta K]. \tag{11.4}$$

The adjoint equation is

$$\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial K} = (\rho + \delta)\lambda - \lambda F'(K), \quad \lambda(T) = \alpha, \tag{11.5}$$

where $\alpha$ is a constant to be determined.

The optimal control is given by

$$\frac{\partial H}{\partial C} = U'(C) - \lambda = 0. \tag{11.6}$$

Since $U'(0) = \infty$, the solution of this condition always gives $C(t) > 0$.

The economic interpretation of the Hamiltonian is straightforward:
it consists of two terms. The first gives the utility of current
consumption. The second gives the net investment evaluated by price
$\lambda$, which, from (11.6), reflects the marginal utility of consumption.

For our economic system to be run optimally, the solution must
satisfy the following three conditions:

(a) The static efficiency condition (11.6), which maximizes the value
of the Hamiltonian at each instant of time myopically, provided $\lambda(t)$ is
known.

(b) The dynamic efficiency condition (11.5), which forces the price $\lambda$
of capital to change over time in such a way that the capital stock always
yields a net rate of return equal to the social discount rate $\rho$.
That is,

$$d\lambda + \frac{\partial H}{\partial K}\,dt = \rho\lambda\,dt.$$

(c) The long-run foresight condition, which establishes the terminal
price $\lambda(T)$ of capital in such a way that exactly the terminal capital stock
$K_T$ is obtained at $T$.

Equations (11.1), (11.3), (11.5), and (11.6) form a two-point boundary
value problem (TPBVP) which can be solved numerically. A qualitative
analysis of this system can also be carried out by the phase diagram method
of Chapter 7; see also Burmeister and Dobell (1970). We do not give
details here since a similar analysis will be given for the infinite horizon
version of this model treated in Sections 11.1.3 and 11.1.4. However, in
Exercise 11.1 you are asked to solve a simple version of the model in
which the TPBVP can be solved analytically.
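One common way to solve such a two-point boundary value problem numerically is shooting on the unknown initial price $\lambda(0)$. The sketch below is only an illustration under assumptions that are not made in the text: $F(K) = K^{\alpha}$ and $U(C) = \ln C$ (so that (11.6) gives $C = 1/\lambda$), forward Euler integration, a fixed bisection bracket, and hypothetical function names and parameter values.

```python
import math


def terminal_capital(lam0, K0, T, alpha, rho, delta, dt=0.01):
    """Integrate (11.1) and (11.5) forward from (K0, lam0) and return K(T).

    Assumes F(K) = K**alpha and U(C) = ln C, so (11.6) gives C = 1/lam.
    Returns 0.0 if capital is numerically exhausted before T.
    """
    K, lam = K0, lam0
    for _ in range(int(round(T / dt))):
        K_dot = K ** alpha - 1.0 / lam - delta * K
        lam_dot = (rho + delta - alpha * K ** (alpha - 1.0)) * lam
        K += dt * K_dot
        lam += dt * lam_dot
        if K <= 0.0 or lam <= 0.0:
            return 0.0
    return K


def solve_initial_price(K0, K_T, T, alpha, rho, delta, iters=80):
    """Bisect on a log scale for lam(0) such that K(T) = K_T."""
    lo, hi = 1.0e-3, 1.0e6  # assumed bracket: lo over-consumes, hi under-consumes
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if terminal_capital(mid, K0, T, alpha, rho, delta) < K_T:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The bisection relies on $K(T)$ being nondecreasing in $\lambda(0)$: a higher initial price of capital means lower consumption and therefore more accumulation. The terminal value $\lambda(T)$ reached by the converged path plays the role of the constant $\alpha$ in (11.5).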

11.1.3 A One-Sector Model with a Growing Labor Force 

In the preceding sections of this chapter we studied the simplest capital
accumulation model in which the population was assumed to be fixed.
We now want to introduce a new factor, labor (which for simplicity we
treat the same as the population), which is growing exponentially at a
fixed rate $g > 0$. It is now possible to recast the new model in terms of
per capita variables so that it is formally similar to the previous model.
The introduction of the per capita variables makes it possible to treat
the infinite horizon version of the new model.

Let $L(t)$ denote the amount of labor at time $t$. Since it is growing
exponentially at rate $g$, we have

$$L(t) = L(0)e^{gt}.$$

Let $F(K, L)$ be the production function which is assumed to be concave
and homogeneous of degree one in $K$ and $L$. We define $k = K/L$ and
the per capita production function $f(k)$ as

$$f(k) = \frac{F(K, L)}{L} = F\left(\frac{K}{L}, 1\right) = F(k, 1). \tag{11.7}$$

To derive the state equation for $k$, we note that

$$\dot{K} = \dot{k}L + k\dot{L} = \dot{k}L + kgL.$$

Substituting for $\dot{K}$ from (11.1) and defining per capita consumption
$c = C/L$, we get

$$\dot{k} = f(k) - c - \gamma k, \quad k(0) = k_0, \tag{11.8}$$

where $\gamma = g + \delta$.
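The substitution leading to (11.8) can be written out in full. This is a short worked step, assuming only that the output term $F(K)$ in (11.1) is replaced by the two-factor function $F(K, L)$:

```latex
\begin{align*}
\dot{k} &= \frac{\dot{K}}{L} - gk
         = \frac{F(K,L) - C - \delta K}{L} - gk \\
        &= f(k) - c - (g + \delta)k
         = f(k) - c - \gamma k .
\end{align*}
```

The first equality rearranges $\dot{K} = \dot{k}L + kgL$, and the third uses (11.7) together with $c = C/L$ and $k = K/L$.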

Let $u(c)$ be the utility of per capita consumption of $c$, where $u$ is
assumed to satisfy

$$u'(c) > 0 \text{ and } u''(c) < 0 \text{ for } c \geq 0. \tag{11.9}$$

As before we assume $u'(0) = \infty$ to rule out zero consumption. The
objective is to

$$\text{maximize} \left\{ J = \int_0^\infty e^{-\rho t}u(c)\,dt \right\}. \tag{11.10}$$

Note that the optimal control model defined by (11.8) and (11.10) is
a generalization of Exercise 3.6.

11.1.4 Solution by the Maximum Principle

The current-value Hamiltonian is

$$H = u(c) + \lambda[f(k) - c - \gamma k]. \tag{11.11}$$

The adjoint equation is

$$\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial k} = (\rho + \gamma)\lambda - f'(k)\lambda. \tag{11.12}$$

To obtain the optimal control, we differentiate (11.11) with respect to $c$,
set it to zero, and solve

$$u'(c) = \lambda. \tag{11.13}$$

Let $c = h(\lambda)$ be the solution of (11.13).

To show that the maximum principle is sufficient for optimality we
will show that the derived Hamiltonian $H^0(k, \lambda)$ is concave in $k$ for any
$\lambda$ solving (11.13); see Exercise 11.2. However, this follows immediately
from the facts that $u'(c)$ is positive because of (11.9) and that $f(k)$ is
concave because of the assumptions on $F(K, L)$.

Equations (11.8), (11.12), and (11.13) now constitute a complete
autonomous system, since time does not enter explicitly in these equations.
Therefore, we can use the phase diagram solution technique employed in
Chapter 7.

In Figure 11.1 we have drawn a phase diagram for the two equations

$$\dot{k} = f(k) - h(\lambda) - \gamma k = 0, \tag{11.14}$$

$$\dot{\lambda} = (\rho + \gamma)\lambda - f'(k)\lambda = 0, \tag{11.15}$$

obtained from (11.8), (11.12), and (11.13). In Exercise 11.3 you are asked
to show that the graphs of $\dot{k} = 0$ and $\dot{\lambda} = 0$ are as shown in Figure 11.1.
The point of intersection of these two graphs is $(\bar{k}, \bar{\lambda})$.




Figure 11.1: Phase Diagram for the Optimal Growth Model

The two graphs divide the plane into four regions, I, II, III, and IV,
as marked in Figure 11.1. To the left of the vertical line $\dot{\lambda} = 0$, $k < \bar{k}$
and $\rho + \gamma < f'(k)$ so that $\dot{\lambda} < 0$ from (11.12). Therefore, $\lambda$ is decreasing,
which is indicated by the downward pointing arrows in Regions I and
IV. Similarly, to the right of the vertical line, in Regions II and III, the
arrows are pointed upward because $\lambda$ is increasing. In Exercise 11.4 you
are asked to show that the horizontal arrows, which indicate the direction
of change in $k$, point to the right above the $\dot{k} = 0$ curve, i.e., in Regions I
and II, and they point to the left in Regions III and IV which are below
the $\dot{k} = 0$ curve.

The point $(\bar{k}, \bar{\lambda})$ represents the optimal long-run stationary
equilibrium. The values of $\bar{k}$ and $\bar{\lambda}$ were obtained in Exercise 11.3. We now
want to see if there is a path satisfying the maximum principle which
converges to the equilibrium.

Clearly such a path cannot start in Regions II and IV, because the
directions of the arrows in these areas point away from $(\bar{k}, \bar{\lambda})$. For $k_0 < \bar{k}$,
the value of $\lambda_0$ (if any) must be selected so that $(k_0, \lambda_0)$ is in Region I.
For $k_0 > \bar{k}$, on the other hand, the point $(k_0, \lambda_0)$ must be chosen to be in
Region III. We analyze the case $k_0 < \bar{k}$ only, and show that there exists
a unique $\lambda_0$ associated with the given $k_0$. The locus of such $(k_0, \lambda_0)$ is
shown by the dotted curve in Figure 11.1.

In Region I, $k(t)$ is an increasing function of $t$ as indicated by the
horizontal right-directed arrow. Therefore, we can replace the independent
variable $t$ by $k$, and then from (11.14) and (11.15),

$$\frac{d(\ln\lambda)}{dk} = \left[\frac{1}{\lambda}\frac{d\lambda}{dt}\right] \bigg/ \frac{dk}{dt} = \frac{f'(k) - (\rho + \gamma)}{h(\lambda) + \gamma k - f(k)}. \tag{11.16}$$

For $k < \bar{k}$, the right-hand side of (11.16) is negative, and since $h(\lambda)$
decreases with $\lambda$, we have $d(\ln\lambda)/dk$ increasing with $\lambda$.

We show next that there can be at most one trajectory for an initial
capital $k_0 < \bar{k}$. Assume to the contrary that $\lambda_1(k)$ and $\lambda_2(k)$ are two
paths leading to $(\bar{k}, \bar{\lambda})$ and are such that the selected initial values satisfy
$\lambda_1(k_0) > \lambda_2(k_0)$. Since $d(\ln\lambda)/dk$ increases with $\lambda$,

$$\frac{d\ln[\lambda_1(k)/\lambda_2(k)]}{dk} = \frac{d\ln\lambda_1(k)}{dk} - \frac{d\ln\lambda_2(k)}{dk} > 0$$

whenever $\lambda_1(k) > \lambda_2(k)$. This inequality clearly holds at $k_0$, and by
(11.16), $\lambda_1(k)/\lambda_2(k)$ increases at $k_0$. This in turn implies that the
inequality holds at $k_0 + \varepsilon$, where $\varepsilon > 0$ is small. Now replace $k_0$ by $k_0 + \varepsilon$
and repeat the argument. Thus, the ratio $\lambda_1(k)/\lambda_2(k)$ increases as $k$
increases so that $\lambda_1(k)$ and $\lambda_2(k)$ cannot both converge to $\bar{\lambda}$ as $k \to \bar{k}$.

To show that for $k_0 < \bar{k}$, there exists a $\lambda_0$ such that the trajectory
converges to $(\bar{k}, \bar{\lambda})$, note that for some starting values of the adjoint
variable, the resulting trajectory $(k, \lambda)$ enters Region II and then diverges,
while for others it enters Region IV and diverges. By continuity, there
exists a starting value $\lambda_0$ such that the resulting trajectory $(k, \lambda)$
converges to $(\bar{k}, \bar{\lambda})$.

Similar arguments hold for the case $k_0 > \bar{k}$, which we therefore omit.
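The existence argument suggests a simple numerical procedure: bisect on $\lambda_0$ between starting values whose trajectories leave Region I into Region IV and those that leave into Region II. The following is a minimal sketch under assumptions that are not in the text: $f(k) = k^{\alpha}$, $u(c) = \ln c$ (so $h(\lambda) = 1/\lambda$), forward Euler integration, $k_0 < \bar{k}$ with $k_0$ not too close to zero, and illustrative function names and bracket.

```python
def steady_state(alpha, rho, gamma):
    """Equilibrium (k_bar, lam_bar): f'(k_bar) = rho + gamma, lam_bar = u'(c_bar)."""
    k_bar = (alpha / (rho + gamma)) ** (1.0 / (1.0 - alpha))
    c_bar = k_bar ** alpha - gamma * k_bar
    return k_bar, 1.0 / c_bar


def shoot(k0, lam0, alpha, rho, gamma, dt=0.02, t_max=200.0):
    """Follow (11.8) and (11.12) from Region I; report the region entered."""
    k_bar, _ = steady_state(alpha, rho, gamma)
    k, lam = k0, lam0
    for _ in range(int(t_max / dt)):
        k_dot = k ** alpha - 1.0 / lam - gamma * k
        if k_dot < 0.0:
            return "IV"  # fell below the k_dot = 0 curve: lam0 was too low
        if k > k_bar:
            return "II"  # crossed the vertical line k = k_bar: lam0 was too high
        lam_dot = (rho + gamma - alpha * k ** (alpha - 1.0)) * lam
        k, lam = k + dt * k_dot, lam + dt * lam_dot
    return "undecided"


def saddle_path_lambda(k0, alpha, rho, gamma, iters=60):
    """Bisect for the lam0 separating Region IV exits from Region II exits."""
    lo = 1.0 / k0 ** alpha  # consumes all output, so k falls at once (Region IV)
    hi = 1.0e3 * lo         # almost no consumption, so k overshoots (Region II)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if shoot(k0, mid, alpha, rho, gamma) == "IV":
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the equilibrium is a saddle point, any numerical trajectory eventually drifts away from it; the bisection only pins down the starting value $\lambda_0$ on the dotted curve to within rounding error.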

11.2 A Model of Optimal Epidemic Control 

Certain infectious epidemic diseases are seasonal in nature. Examples 
are the common cold, flu, and certain children’s diseases. When it is 
beneficial to do so, control measures are taken to alleviate the effects 
of these diseases. Here we discuss a simple control model due to Sethi 
(1974c) for analyzing the epidemic problem. Related problems have been 
treated by Sethi and Staats (1978), Sethi (1978d), and Francis (1997). 
See Wickwire (1977) for a good survey of optimal control theory applied 
to the control of pest infestations and epidemics, and Swan (1984) for 
applications to biomedicine. 

11.2.1 Formulation of the Model 

Let $N$ be the total fixed population. Let $x(t)$ be the number of infectives
at time $t$ so that the remaining $N - x(t)$ is the number of susceptibles.
To keep the model simple, assume that no immunity is acquired so that
when infected people are cured, they become susceptible again. The
state equation governing the dynamics of the epidemic spread in the
population is

$$\dot{x} = \beta x(N - x) - vx, \quad x(0) = x_0, \tag{11.17}$$

where $\beta$ is a positive constant termed infectivity of the disease, and $v$
is a control variable reflecting the level of medical program effort. Note
that $x(t)$ is in $[0, N]$ for all $t > 0$ if $x_0$ is in that interval.

The objective of the control problem is to minimize the present value
of the cost stream up to a horizon time $T$, which marks the end of the
season for that disease. Let $C$ denote the unit social cost per infective,
let $K$ denote the cost of control per unit level of program effort, and let
$Q$ denote the capability of the health care delivery system providing an
upper bound on $v$. The optimal control problem is:

$$\max \left\{ J = \int_0^T -(Cx + Kv)e^{-\rho t}\,dt \right\} \tag{11.18}$$

subject to (11.17), the terminal constraint that

$$x(T) = x_T,$$

and the control constraint

$$0 \leq v \leq Q.$$



11.2.2 Solution by Green’s Theorem 

Rewriting (11.17) as

$$v\,dt = [\beta x(N - x)\,dt - dx]/x \tag{11.19}$$

and substituting into (11.18) yields the line integral

$$J_\Gamma = -\int_\Gamma \left\{ [Cx + K\beta(N - x)]e^{-\rho t}\,dt - \frac{K}{x}e^{-\rho t}\,dx \right\}, \tag{11.20}$$

where $\Gamma$ is a path from $x_0$ to $x_T$ in the $(t, x)$-space. Let $\Gamma_1$ and $\Gamma_2$ be
two such paths from $x_0$ to $x_T$, and let $R$ be the region enclosed by $\Gamma_1$
and $\Gamma_2$. By Green's theorem, we can write

$$J_{\Gamma_1 - \Gamma_2} = J_{\Gamma_1} - J_{\Gamma_2} = \iint_R \left[ \frac{K\rho}{x} - C + K\beta \right] e^{-\rho t}\,dt\,dx. \tag{11.21}$$

To obtain the singular control we set the integrand of (11.21) equal to
zero, as we did in Chapter 7. This yields

$$x = \frac{\rho}{C/K - \beta} = \frac{\rho}{\theta}, \tag{11.22}$$

where $\theta = C/K - \beta$. Define the singular state $x^s$ as follows:

$$x^s = \begin{cases} \rho/\theta & \text{if } 0 < \rho/\theta < N, \\ N & \text{otherwise}. \end{cases} \tag{11.23}$$

The corresponding singular control level is

$$v^s = \beta(N - x^s) = \begin{cases} \beta(N - \rho/\theta) & \text{if } 0 < \rho/\theta < N, \\ 0 & \text{otherwise}. \end{cases} \tag{11.24}$$

We will show that $x^s$ is the turnpike level of infectives. It is instructive
to interpret (11.23) and (11.24) for the various cases. If $\rho/\theta > 0$, then
$\theta > 0$ so that $C/K > \beta$. Here the smaller the ratio $C/K$, the larger
the turnpike level $x^s$, and therefore, the smaller the medical program
effort should be. In other words, the smaller the social cost per
infective and/or the larger the treatment cost per infective, the smaller the
medical program effort should be.

When $\rho/\theta < 0$, you are asked to show in Exercise 11.6 that $x^s = N$
in the case $C/K < \beta$, which means the ratio of the social cost to the
treatment cost is smaller than the infectivity coefficient. Therefore, in
this case when there is no terminal constraint, the optimal trajectory
involves no treatment effort. An example of this case is the common
cold where the social cost is low and treatment cost is high.
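The turnpike quantities in (11.22) to (11.24) are easy to tabulate. The sketch below is illustrative only: the function names, the Euler simulation of (11.17), and all parameter values used with it are assumptions made here, not data from the text.

```python
def singular_levels(C, K, beta, rho, N):
    """Singular state x_s of (11.23) and singular control v_s of (11.24)."""
    theta = C / K - beta
    if theta > 0.0 and rho / theta < N:
        x_s = rho / theta
    else:
        x_s = float(N)  # no interior turnpike: no treatment effort
    return x_s, beta * (N - x_s)


def feedback_control(x, x_s, v_s, Q):
    """Bang-bang rule steering x toward x_s: full effort above, none below."""
    if x > x_s:
        return Q
    if x < x_s:
        return 0.0
    return v_s


def simulate_infectives(x0, v, beta, N, T, dt=0.001):
    """Forward Euler integration of (11.17) under a constant effort level v."""
    x = x0
    for _ in range(int(round(T / dt))):
        x += dt * (beta * x * (N - x) - v * x)
    return x
```

For example, with the made-up values $N = 1000$, $\beta = 0.001$ and $Q = 0.9$, `simulate_infectives(500.0, 0.9, 0.001, 1000.0, 100.0)` approaches $N - Q/\beta = 100$, the level that $x$ tends to under maximum effort $v = Q$.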

The optimal control for the fortuitous case when $x_T = x^s$ is

$$v^*(x(t)) = \begin{cases} Q & \text{if } x(t) > x^s, \\ v^s & \text{if } x(t) = x^s, \\ 0 & \text{if } x(t) < x^s. \end{cases} \tag{11.25}$$

When $x_T \neq x^s$, there are two cases to consider. For simplicity of
exposition we assume $x_0 > x^s$ and $T$ and $Q$ to be large.

Case 1: $x_T > x^s$

The optimal trajectory is shown in Figure 11.2. In Exercise 11.5 you
are asked to show its optimality by using Green's theorem.

Case 2: $x_T < x^s$

The optimal trajectory is shown in Figure 11.3. It can be shown that
$x$ goes asymptotically to $N - Q/\beta$ if $v = Q$. The level is marked in
Figure 11.3.

The optimal control shown in Figures 11.2 and 11.3 assumes $0 < x^s < N$.
It also assumes that $T$ is large so that the trajectory will spend
some time on the turnpike and $Q$ is large so that $x^s > N - Q/\beta$. The
graphs are drawn for $x_0 > x^s$ and $x^s < N/2$; for all other cases see Sethi
(1974c).

Figure 11.2: Optimal Trajectory when $x_T > x^s$

Figure 11.3: Optimal Trajectory when $x_T < x^s$



11.3 A Pollution Control Model 

In this section we shall describe a simple pollution control model due to 
Keeler, Spence, and Zeckhauser (1971). We shall describe this model in 
terms of an economic system in which labor is the only primary factor 
of production, which is allocated between food production and DDT 
production. Once produced (and used) DDT is a pollutant which can 
only be reduced by natural decay. However, DDT is a secondary factor 
of production which, along with labor, determines the food output. The 
objective of the society is to maximize the total present value of the 
utility of food less the disutility of pollution due to the DDT use. 



11.3.1 Model Formulation 



We introduce the following notation: 



$L$ = the total labor force, assumed to be constant for simplicity,

$v$ = the amount of labor used for DDT production,

$L - v$ = the amount of labor used for food production,

$P$ = the stock of pollution at time $t$,

$a(v)$ = the rate of DDT output; $a(0) = 0$, $a' > 0$, $a'' < 0$ for $v \geq 0$,

$\delta$ = the natural exponential decay rate of DDT pollution,

$C(v) = f[L - v, a(v)]$ = the rate of food output; $C(v)$ is concave,
$C(0) > 0$, $C(L) = 0$; $C(v)$ attains a unique maximum at $v = V > 0$;
see Figure 11.4. Note that a sufficient condition for $C(v)$ to be
strictly concave is $f_{12} \geq 0$ along with the usual concavity and
monotonicity conditions on $f$,

$g(C)$ = the utility of consumption; $g'(0) = \infty$, $g' \geq 0$, $g'' < 0$,

$h(P)$ = the disutility of pollution; $h'(0) = 0$, $h' \geq 0$, $h'' > 0$.

The optimal control problem is:

$$\max \left\{ J = \int_0^\infty e^{-\rho t}[g(C(v)) - h(P)]\,dt \right\} \tag{11.26}$$

subject to

$$\dot{P} = a(v) - \delta P, \quad P(0) = P_0, \tag{11.27}$$

$$0 \leq v \leq L. \tag{11.28}$$

Figure 11.4: Food Output Function

From Figure 11.4 it is obvious that $v$ is at most $V$, since the production
of DDT beyond that level decreases food production as well as increases
DDT pollution. Hence, (11.28) can be reduced to simply

$$v \geq 0. \tag{11.29}$$

11.3.2 Solution by the Maximum Principle 

Form the current-value Lagrangian

$$L = g[C(v)] - h(P) + \lambda[a(v) - \delta P] + \mu v \tag{11.30}$$

using (11.26), (11.27) and (11.29), where

$$\dot{\lambda} = (\rho + \delta)\lambda + h'(P), \tag{11.31}$$

and

$$\mu \geq 0 \quad \text{and} \quad \mu v = 0. \tag{11.32}$$

The optimal solution is given by

$$\frac{\partial L}{\partial v} = g'[C(v)]C'(v) + \lambda a'(v) + \mu = 0. \tag{11.33}$$

Since the derived Hamiltonian is concave, conditions (11.30)-(11.33)
together with

$$\lim_{t \to \infty} \lambda(t) = \bar{\lambda} = \text{constant} \tag{11.34}$$

are sufficient for optimality; see Theorem 2.1 and Section 2.4. The phase
diagram analysis presented below gives $\lambda(t)$ satisfying (11.34).



11.3.3 Phase Diagram Analysis 

Since $h'(0) = 0$, $g'(0) = \infty$, and $V > 0$, it pays to produce some positive
amount of DDT in equilibrium. Therefore, the equilibrium value of
the Lagrange multiplier is zero, i.e., $\bar{\mu} = 0$. From (11.27), (11.31) and
(11.33), we get the equilibrium values $\bar{P}$, $\bar{\lambda}$, and $\bar{v}$ as follows:

$$\bar{P} = \frac{a(\bar{v})}{\delta}, \tag{11.35}$$

$$\bar{\lambda} = -\frac{h'(\bar{P})}{\rho + \delta} = -\frac{g'[C(\bar{v})]C'(\bar{v})}{a'(\bar{v})}. \tag{11.36}$$

From (11.36) and the assumptions on the derivatives of $g$, $C$ and $a$, we
know that $\bar{\lambda} < 0$. From this and (11.31), we conclude that $\lambda(t)$ is always
negative. The economic interpretation of $\lambda$ is that $-\lambda$ is the imputed
cost of pollution. Let $v = \Phi(\lambda)$ denote the solution of (11.33) with $\mu = 0$.
On account of (11.29), define

$$v^* = \max[0, \Phi(\lambda)]. \tag{11.37}$$



We know from the interpretation of $\lambda$ that when $\lambda$ increases, the imputed
cost of pollution decreases, which can justify an increase in the DDT
production to ensure an increased food output. Thus, it is reasonable to
assume that

$$\frac{d\Phi}{d\lambda} > 0,$$

and we will make this assumption. It follows that there exists a unique
$\lambda^c$ such that $\Phi(\lambda^c) = 0$, $\Phi(\lambda) < 0$ for $\lambda < \lambda^c$ and
$\Phi(\lambda) > 0$ for $\lambda > \lambda^c$.

To construct the phase diagram, we must plot $\dot{P} = 0$ and $\dot{\lambda} = 0$.
These are

$$P = \frac{a(v^*)}{\delta} = \frac{a[\max\{0, \Phi(\lambda)\}]}{\delta}, \tag{11.38}$$

$$h'(P) = -(\rho + \delta)\lambda. \tag{11.39}$$

Figure 11.5: Phase Diagram for the Pollution Control Model (horizontal axis $\lambda$, vertical axis $P$)



Observe that the assumption $h'(0) = 0$ implies that the graph of (11.39)
passes through the origin. Differentiating these equations with respect
to $\lambda$ and using (11.37), we obtain

$$\frac{dP}{d\lambda} = \frac{a'(v)}{\delta}\,\frac{dv}{d\lambda} > 0 \tag{11.40}$$

as the slope of (11.38), and

$$\frac{dP}{d\lambda} = -\frac{\rho + \delta}{h''(P)} < 0 \tag{11.41}$$

as the slope of (11.39).

Using (11.35), (11.36), (11.40), and (11.41), we can draw (11.38) and
(11.39) in the $(\lambda, P)$-space as shown in Figure 11.5.

The intersection point $(\bar{\lambda}, \bar{P})$ of these curves denotes the equilibrium
levels for the adjoint variable and the pollution stock, respectively. From
arguments similar to those in Section 11.1.4, it can be shown that there
exists an optimal path (shown dotted in the figure) converging to the
equilibrium $(\bar{\lambda}, \bar{P})$.

Given $\lambda^c$ as the intersection of the $\dot{P} = 0$ curve and the horizontal
axis, the corresponding ordinate $P^c$ on the optimal trajectory is the
related pollution stock level. The significance of $P^c$ is that if the existing
pollution stock is larger than $P^c$, then the optimal control is $v^* = 0$,
meaning no DDT is produced.

Given an initial level of pollution $P_0$, the optimal trajectory curve in
Figure 11.5 provides the initial value $\lambda_0$ of the adjoint variable. With
these initial values, the optimal trajectory is determined by (11.27),
(11.31), and (11.37). If $P_0 > P^c$, as shown in Figure 11.5, then $v^* = 0$
until such time that the natural decay of pollution stock has reduced
it to $P^c$. At that time the adjoint variable has increased to the value
$\lambda^c$. The optimal control is $v^* = \Phi(\lambda)$ from this time on, and the path
converges to $(\bar{\lambda}, \bar{P})$.
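
The length of the initial no-production phase follows directly from (11.27). As a short worked step, using $a(0) = 0$ from the assumptions on $a$, writing $P^c$ for the threshold pollution stock, and introducing $t^c$ (a symbol not used in the text) for the time at which production starts:

```latex
% With v^* = 0 and a(0) = 0, the state equation (11.27) reduces to pure decay.
\dot{P} = -\delta P, \quad P(0) = P_0
\;\Longrightarrow\; P(t) = P_0 e^{-\delta t},
\qquad
P(t^c) = P^c
\;\Longrightarrow\; t^c = \frac{1}{\delta}\,\ln\frac{P_0}{P^c}.
```

The smaller the decay rate $\delta$, the longer the stock takes to fall to the threshold.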

At equilibrium, $\bar{v} = \Phi(\bar{\lambda}) > 0$, which implies that it is optimal to
produce some DDT forever in the long run. The only time when its
production is not optimal is at the beginning when the pollution stock
is higher than $P^c$.
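
To make the equilibrium conditions (11.35) and (11.36) concrete, the following is a minimal numerical sketch. The functional forms for $a$, $C$, $g$, and $h$ and all parameter values are assumptions chosen only to satisfy the conditions stated in the model; they are not taken from the text. The two expressions for the equilibrium adjoint value in (11.36) are equated and the resulting equation in $v$ is solved by bisection.

```python
import math

# Illustrative functional forms (assumptions, not from the text), chosen to
# satisfy the stated conditions on a, C, g, and h.
L_TOTAL = 1.0          # total labor force L
RHO, DELTA = 0.1, 0.2  # discount rate and natural decay rate

def a(v):        return math.log(1.0 + v)          # a(0)=0, a'>0, a''<0
def a_prime(v):  return 1.0 / (1.0 + v)
def C(v):        return (L_TOTAL - v) * (v + 0.2)  # concave, C(0)>0, C(L)=0
def C_prime(v):  return L_TOTAL - 0.2 - 2.0 * v
def g_prime(c):  return 1.0 / math.sqrt(c)         # g(C) = 2*sqrt(C)
def h_prime(p):  return p                          # h(P) = P**2 / 2

V_MAX = (L_TOTAL - 0.2) / 2.0   # maximizer V of C(v)

def residual(v):
    """Difference between the two expressions for -lambda_bar in (11.36)."""
    p = a(v) / DELTA                      # equilibrium stock from (11.35)
    return g_prime(C(v)) * C_prime(v) / a_prime(v) - h_prime(p) / (RHO + DELTA)

def solve_equilibrium(tol=1e-12):
    lo, hi = 1e-9, V_MAX                  # residual > 0 at lo, < 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    v_bar = 0.5 * (lo + hi)
    p_bar = a(v_bar) / DELTA
    lam_bar = -h_prime(p_bar) / (RHO + DELTA)
    return v_bar, p_bar, lam_bar

v_bar, p_bar, lam_bar = solve_equilibrium()
print(v_bar, p_bar, lam_bar)
```

For these assumed forms the computed equilibrium has a positive DDT labor input below $V$ and a negative adjoint value, in line with the discussion above.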

It is important to examine the effects of changes in the parameters on
the optimal path. In particular, you are asked in Exercise 11.7 to show
that an increase in the natural rate of decay of pollution, $\delta$, will increase
$P^c$. That is, the higher is the rate of decay, the higher is the level of
pollution stock at which the pollutant's production is banned. For DDT,
$\delta$ is small so that its complete ban, which has actually occurred, may
not be far from the optimal policy.

Here we have presented a very simple model of pollution in which 
the problem was to choose an optimal production process. Models in 
which the control variable to determine is the optimal amount to spend
in reducing the pollution output of an existing dirty process have also 
been formulated; see Wright (1974) and Sethi (1977d). For yet other 
related models, see Luptacik and Schubert (1982), Hartl and Luptacik 
(1992), and Hartl and Kort (1996a, 1996b, 1996c, 1977). 

11.4 Miscellaneous Applications 

The number of papers which apply control theory to problems in eco- 
nomics and management science is now so large that it is impossible to 
cover them in detail within the confines of a single book. We satisfy 
ourselves by listing selected references with a brief indication of their 
contents. 

For control theory applications to economics, see Tu (1969) and
Southwick and Zionts (1974) for optimal educational investments,
Kamien and Schwartz (1971b) for limit pricing and uncertain entry,
Treadway (1970) for adjustment costs in the theory of competitive firms,
Vousden (1974) for international trade, Harris (1976) for money demand
with transaction costs, Raviv (1979) for the design of an optimal
insurance policy, Sethi and McGuire (1977) for optimal training and
heterogeneous labor, Arthur and McNicoll (1977) for population policy,
Brito and Oakland (1977) for optimal income tax, Thompson (1982a, 1982b)
for continuous expanding economies, Thepot (1983) for investment and
marketing policies in a duopoly, Verheyen (1985) for a theory of the firm
under government regulations, Hartl and Mehlmann (1986) for
remuneration patterns for medical services, Schijndel (1986) for dynamic
shareholder behavior under personal taxation, Hartl and Kort (1997)
for optimal input substitution in response to environmental constraints,
and Feichtinger, Hartl, Haunschmied, and Kort (1998) for optimal
crackdowns on a drug market.

For control theory applications to management science and operations
research, see Nelson (1960) for labor assignments, Fan and Wang
(1964), Charnes and Kortanek (1966), Tapiero and Soliman (1972) and
Bookbinder and Sethi (1980) for distribution and transportation
applications, Nepomiastchy (1970) and Zimin and Ivanilov (1971) for
scheduling and network planning problems, Lucas (1971) for research and
development, Legey, Ripper, and Varaiya (1973) for city congestion problems,
Taylor (1974) for warfare models, Mehra (1975) for national settlement
planning, Kalish (1983) for pricing with dynamic demand and production
costs, Kalish and Lilien (1983) for optimal price subsidy for accelerating
diffusion of innovation, Gaimon (1986c) for optimal acquisition of new
technology, Dockner and Jørgensen (1988) and Jedidi, Eliashberg, and
DeSarbo (1989) for optimal pricing and/or advertising for a monopolistic
diffusion model, Hartl and Jørgensen (1985) for manpower planning,
Ringbeck (1985) for optimal quality and advertising under asymmetric
information, Hartl and Krauth (1989) for optimal production mix, Hartl,
Feichtinger, and Kirakossian (1992) for optimal recycling of tailings for
production of building materials, and Gaimon (1997) for planning for
information technology.

Finally, we conclude this section by citing a series of rather unusual
but humorous applications of optimal control theory that began with
the Sethi (1979b) paper on optimal pilfering policies for dynamic
continuous thieves. These are Hartl and Mehlmann (1982, 1983) and Hartl,
Mehlmann, and Novak (1992) on optimal blood consumption by vampires,
Hartl and Mehlmann (1986) on remuneration patterns for medical
services, Hartl and Jørgensen (1988, 1990) on optimal slidemanship
at conferences, Jørgensen (1992) on the dynamics of extramarital
affairs, and Feichtinger, Jørgensen, and Novak (1999) on Petrarch's
Canzoniere: rational addiction and amorous cycles. See also the monograph
by Mehlmann (1997) on unusual and humorous applications of
differential games.



EXERCISES FOR CHAPTER 11 

11.1 For the model treated in Sections 11.1.1 and 11.1.2, assume
$F(K) = bK$ and $U(C) = (C - \bar{C})^{1-\theta}/(1-\theta)$, where $0 < \theta < 1$,
$\bar{C} > 0$ is a constant, and $b - \delta > 0$ is a constant satisfying
$(b - \delta)(1 - \theta) < \rho < b - \delta$. Set $\beta = b - \delta$ and assume $\theta = 1/2$
for simplicity. Also assume that $K_0 e^{\beta T} + \bar{C}(1 - e^{\beta T})/\beta > K_T$
for the problem to be well-posed (note that the left-hand side of
this inequality is the amount of capital at $T$ associated with the
consumption rate $\bar{C}$). Solve the new model.

11.2 Obtain the expression for the derived Hamiltonian from
equation (11.11).

11.3 (a) Obtain the value of $\bar{k}$ in Figure 11.1 from equation (11.15).

(b) Show that the graph of $\dot{k} = 0$ starts from $+\infty$ when $k = 0$,
decreases to a minimum of $\hat{\lambda}$ at $\hat{k}$, and then increases. Also
obtain the expression for $\hat{\lambda}$.

(c) Show that $\bar{k} < \hat{k}$.

11.4 Show that the directions of the horizontal arrows above and below
the $\dot{k} = 0$ curve are as drawn in Figure 11.1.

11.5 Show that the trajectory $x_0 B L x_T$ shown in Figure 11.2 is optimal
for the epidemic model under the stated assumptions. Assume
0 <x^ < N. 

11.6 In (11.23), show by using Green’s theorem that x^ = N if pjd < 0, 

11.7 Show that the $P^c$ of Figure 11.5 increases as $\delta$ in equation (11.27)
increases.

11.8* A variation of the optimal capital accumulation model, known as
Ramsey's model, is:

$$\max \left\{ J = \int_0^\infty [u(c) - B]\,dt \right\}$$

subject to

$$\dot{k} = f(k) - c - \delta k, \quad k(0) = k_0,$$

where

$$B = \sup_{c \ge 0} u(c) > 0$$

is the so-called Bliss point,

$$\lim_{t \to \infty} u[c(t)] = B$$

so that the integral in the objective function converges, and
$\lim_{t \to \infty} u'[c(t)] = 0$. See Ramsey (1928).

(a) Show that the optimal capital stock trajectory satisfies the
differential equation

$$u'(f(k) - \delta k - \dot{k})\,\dot{k} = B - u(f(k) - \delta k - \dot{k}).$$

(b) From part (a), derive Ramsey's rule.

(c) Assume $u(c) = 2c - c^2/B$ and $f(k) = \alpha k$, where $\gamma = \alpha - \delta > 0$
and $\gamma < B/k_0 < 2\gamma$. Show that the optimal feedback
consumption rule is

$$c^*(k) = 2\gamma k - B$$

and the optimal capital trajectory $k^*$ is given by

$$k^*(t) = \frac{1}{\gamma}\left[B - (B - \gamma k_0)e^{-\gamma t}\right].$$



Chapter 12 

Differential Games, 
Distributed Systems, and 
Impulse Control 



In previous chapters, we were mainly concerned with the optimal control 
problems formulated in Chapters 3 and 4 and their applications to vari- 
ous functional areas of management and to some problems in economics. 
These problems were described by a single objective function (or a single 
decision maker) and a set of ordinary differential equations, called the 
state equations, defined in a deterministic framework. 

In this chapter, we deal with generalizations of the (ordinary) deter- 
ministic optimal control problems that can be made in one or more of 
the following directions. See Chapter 13 for stochastic optimal control 
problems. 

There may be more than one decision maker, each having a separate
objective function which he is trying to maximize, subject to a set
of differential equations. This extension of the optimal control theory is
referred to as the theory of differential games. Section 12.1 contains a
brief introduction to differential games along with an application.

Another extension replaces the system of ordinary differential
equations by a set of partial differential equations. These come under the
classification of distributed parameter systems and are treated in Section
12.2.

Finally in Section 12.3, we treat the theory of impulse control. This
control is developed to deal with systems which, in addition to
conventional controls, allow a controller to make discrete changes in the state
variables at selected instants of time in an optimal fashion. This theory
is especially useful in dealing with inventory problems such as dynamic
lot size problems. It also allows a setup cost associated with the discrete
changes in the state variables to be charged to such discrete actions as
setting up a machine tool from making one product to making another,
opening a warehouse, etc.

Extensions of optimal control problems where uncertainties are 
present will be discussed in the next chapter. 

12.1 Differential Games 

The study of differential games was initiated by Isaacs (1965). After the
development of Pontryagin's maximum principle, it became clear that
there was a connection between differential games and optimal control
theory. In fact, differential game problems represent a generalization of
optimal control problems in cases where there is more than one
controller or player. However, differential games are conceptually far more
complex than optimal control problems in the sense that it is no longer
obvious what constitutes a solution; see Starr and Ho (1969), Ho (1970),
Varaiya (1970), Friedman (1971), Leitmann (1974), Case (1979), Selten
(1975), Basar (1986), Mehlmann (1988), Berkovitz (1994), and Dockner,
Jørgensen, Long, and Sorger (2000). Indeed, there are a number
of different types of solutions such as minimax, Nash, and Pareto-optimal,
along with possibilities of cooperation and bargaining; see, e.g.,
Tolwinski (1982) and Haurie, Tolwinski, and Leitmann (1983). We will confine
ourselves to minimax solutions for zero-sum differential games and Nash
solutions for nonzero-sum games.

12.1.1 Two Person Zero-Sum Differential Games

Consider the state equation

$$\dot{x} = f(x, u, v, t), \quad x(0) = x_0, \tag{12.1}$$

where we may assume all variables to be scalar for the time being.
Extension to the vector case simply requires appropriate reinterpretations
of each of the variables and the equations. In this equation, we let $u$
and $v$ denote the controls applied by players 1 and 2, respectively. We
assume that

$$u(t) \in U, \quad v(t) \in V, \quad t \in [0, T],$$

where $U$ and $V$ are convex sets in $E^1$. Consider further the objective
function

$$J(u, v) = S[x(T)] + \int_0^T F(x, u, v, t)\,dt, \tag{12.2}$$

which player 1 wants to maximize and player 2 wants to minimize. Since
the gain of player 1 represents a loss to player 2, such games are
appropriately termed zero-sum games. Clearly, we are looking for admissible
control trajectories $u^*$ and $v^*$ such that

$$J(u^*, v) \ge J(u^*, v^*) \ge J(u, v^*). \tag{12.3}$$

The solution $(u^*, v^*)$ is known as the minimax solution. Here $u^*$ and $v^*$
stand for $u^*(t)$, $t \in [0, T]$, and $v^*(t)$, $t \in [0, T]$, respectively.

The necessary conditions for $u^*$ and $v^*$ to satisfy (12.3) are given by
an extension of the maximum principle. To obtain these conditions, we
form the Hamiltonian

$$H = F + \lambda f \tag{12.4}$$

with the adjoint variable $\lambda$ satisfying the equation

$$\dot{\lambda} = -H_x, \quad \lambda(T) = S_x[x(T)]. \tag{12.5}$$

The necessary condition for trajectories $u^*$ and $v^*$ to be a minimax
solution is that for $t \in [0, T]$,

$$H(x^*(t), u^*(t), v^*(t), \lambda^*(t), t) = \min_{v \in V} \max_{u \in U} H(x^*(t), u, v, \lambda^*(t), t), \tag{12.6}$$

which can also be stated, with suppression of $(t)$, as

$$H(x^*, u^*, v, \lambda^*, t) \ge H(x^*, u^*, v^*, \lambda^*, t) \ge H(x^*, u, v^*, \lambda^*, t) \tag{12.7}$$

for $u \in U$ and $v \in V$. Note that $(u^*, v^*)$ is a saddle point of the
Hamiltonian function $H$.

Note that if $u$ and $v$ are unconstrained, i.e., when $U = V = E^1$,
condition (12.6) reduces to the first-order necessary conditions

$$H_u = 0 \quad \text{and} \quad H_v = 0, \tag{12.8}$$

and the second-order conditions are

$$H_{uu} \le 0 \quad \text{and} \quad H_{vv} \ge 0. \tag{12.9}$$
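
The saddle-point conditions (12.7)-(12.9) can be checked numerically on a small example. The sketch below assumes a particular quadratic Hamiltonian (the functions $F$ and $f$ and the constants `B1`, `B2` are illustrative choices, not from the text), solves the first-order conditions (12.8) in closed form, and then tests the two inequalities in (12.7) on a grid of unilateral deviations.

```python
# Illustrative Hamiltonian (an assumption, not from the text):
#   F = -u**2 + v**2,  f = B1*u + B2*v,  so  H = F + lam*f.
# Here H_uu = -2 <= 0 and H_vv = 2 >= 0, as required by (12.9).
B1, B2 = 1.0, 2.0

def hamiltonian(u, v, lam):
    return -u**2 + v**2 + lam * (B1 * u + B2 * v)

def saddle_point(lam):
    u_star = lam * B1 / 2.0    # from H_u = -2u + lam*B1 = 0
    v_star = -lam * B2 / 2.0   # from H_v =  2v + lam*B2 = 0
    return u_star, v_star

def satisfies_12_7(lam, grid):
    """Check both inequalities of (12.7) for deviations taken from grid."""
    u_star, v_star = saddle_point(lam)
    h_star = hamiltonian(u_star, v_star, lam)
    left = all(hamiltonian(u_star, v, lam) >= h_star - 1e-12 for v in grid)
    right = all(h_star >= hamiltonian(u, v_star, lam) - 1e-12 for u in grid)
    return left and right

grid = [i / 10.0 for i in range(-50, 51)]
print(saddle_point(0.5), satisfies_12_7(0.5, grid))
```

Player 1's control maximizes $H$ and player 2's control minimizes it, so a deviation by either player alone moves $H$ in the direction that hurts the deviator.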

We now turn to the treatment of nonzero-sum differential games.

12.1.2 Nonzero-Sum Differential Games

In this section, let us assume that we have $N$ players where $N \ge 2$. Let
$u^i \in U^i$, $i = 1, 2, \ldots, N$, represent the control variable for the $i$th player,
where $U^i$ is the set of controls from which the $i$th player can choose. Let
the state equation be defined as

$$\dot{x} = f(x, u^1, u^2, \ldots, u^N, t). \tag{12.10}$$

Let $J^i$, defined by

$$J^i = S^i[x(T)] + \int_0^T F^i(x, u^1, u^2, \ldots, u^N, t)\,dt, \tag{12.11}$$

denote the objective function which the $i$th player wants to maximize. In
this case, a Nash solution is defined by a set of $N$ admissible trajectories

$$\{u^{1*}, u^{2*}, \ldots, u^{N*}\}, \tag{12.12}$$

which have the property that

$$J^i(u^{1*}, \ldots, u^{(i-1)*}, u^{i*}, u^{(i+1)*}, \ldots, u^{N*}) = \max_{u^i \in U^i} J^i(u^{1*}, \ldots, u^{(i-1)*}, u^i, u^{(i+1)*}, \ldots, u^{N*}) \tag{12.13}$$

for $i = 1, 2, \ldots, N$.

To obtain the necessary conditions for a Nash solution for nonzero- 
sum differential games, we must make a distinction between open-loop 
and closed-loop controls. 



Open-Loop Nash Solution 

The open-loop Nash solution is defined when (12.12) is given as
functions of time satisfying (12.13). To obtain the maximum principle type
conditions for such solutions to be a Nash solution, let us define the
Hamiltonian functions

$$H^i = F^i + \lambda^i f \tag{12.14}$$

for $i = 1, 2, \ldots, N$, with $\lambda^i$ satisfying

$$\dot{\lambda}^i = -H^i_x, \quad \lambda^i(T) = S^i_x[x(T)]. \tag{12.15}$$

The Nash control $u^{i*}$ for the $i$th player is obtained by maximizing the
$i$th Hamiltonian $H^i$ with respect to $u^i$, i.e., $u^{i*}$ must satisfy

$$H^i(x^*, u^{1*}, \ldots, u^{(i-1)*}, u^{i*}, u^{(i+1)*}, \ldots, u^{N*}, \lambda^*, t) \ge H^i(x^*, u^{1*}, \ldots, u^{(i-1)*}, u^i, u^{(i+1)*}, \ldots, u^{N*}, \lambda^*, t), \quad t \in [0, T], \tag{12.16}$$

for all $u^i \in U^i$, $i = 1, 2, \ldots, N$.

Deal, Sethi, and Thompson (1979) formulated and solved an adver- 
tising game with two players and obtained the open-loop Nash solution 
by solving a two-point boundary value problem. In Exercise 12.1, you 
are asked to formulate their problem. See also Deal (1979). 
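
As a minimal illustration of the open-loop conditions (12.14)-(12.16), consider a two-player game invented for this sketch (it is not the advertising game cited above): the state moves according to $\dot{x} = u^1 + u^2$ and player $i$ maximizes $s_i x(T) - \int_0^T (u^i)^2/2\,dt$. The Hamiltonian of player $i$ is $-(u^i)^2/2 + \lambda^i(u^1 + u^2)$, the adjoint is constant with $\lambda^i(T) = s_i$, and maximizing over $u^i$ gives the constant control $u^{i*} = s_i$. The code evaluates the payoffs by Euler integration and checks the Nash property against constant unilateral deviations only, which is a partial check rather than a proof.

```python
# Illustrative two-player game (assumed for this sketch, not from the text):
#   x' = u1 + u2,  J_i = s_i * x(T) - integral of (u_i**2)/2 dt.
# Open-loop conditions give lam_i(t) = s_i and constant controls u_i* = s_i.
T, X0, STEPS = 1.0, 0.0, 1000
S = (1.0, 0.5)   # salvage weights s_1, s_2

def payoff(i, u1, u2):
    """Euler evaluation of J_i for constant controls u1 and u2."""
    dt = T / STEPS
    x, running = X0, 0.0
    own = (u1, u2)[i]
    for _ in range(STEPS):
        running -= 0.5 * own**2 * dt   # control cost of player i
        x += (u1 + u2) * dt            # state equation
    return S[i] * x + running

def is_nash(u1, u2, deviations):
    """True if no constant unilateral deviation improves either payoff."""
    base1, base2 = payoff(0, u1, u2), payoff(1, u1, u2)
    ok1 = all(payoff(0, d, u2) <= base1 + 1e-9 for d in deviations)
    ok2 = all(payoff(1, u1, d) <= base2 + 1e-9 for d in deviations)
    return ok1 and ok2

deviations = [k / 20.0 for k in range(-40, 41)]
print(is_nash(S[0], S[1], deviations))
```

In this example the controls do not depend on the state, so the open-loop and closed-loop solutions coincide; that is a special feature of the assumed game.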

Closed-Loop Nash Solution 

A closed-loop Nash solution is defined when (12.12) is defined in
terms of the state of the system. To avoid confusion, we let

$$u^{i*}(x, t) = \phi^i(x, t), \quad i = 1, 2, \ldots, N. \tag{12.17}$$

For these controls to represent a Nash strategy, we must recognize the
dependence of the other players' actions on the state variable $x$. Therefore,
we need to replace the adjoint equation (12.15) by

$$\dot{\lambda}^i = -H^i_x - \sum_{j=1,\, j \ne i}^{N} H^i_{u^j}\,\phi^j_x. \tag{12.18}$$

The presence of the summation term in (12.18) makes the necessary
condition for the closed-loop solution virtually useless for deriving
computational algorithms; see Starr and Ho (1969). It is, however, possible
to use a dynamic programming approach for solving extremely simple
nonzero-sum games, which require the solution of a partial differential
equation. In Exercise 12.2, you are asked to formulate this partial
differential equation for $N = 2$.

Note that the troublesome summation term in (12.18) is absent in
three important cases: (a) in optimal control problems ($N = 1$) since
$H_u u_x = 0$, (b) in two-person zero-sum games because $H^1 = -H^2$ so
that $H^1_{u^2} u^2_x = -H^2_{u^2} u^2_x = 0$ and $H^2_{u^1} u^1_x = -H^1_{u^1} u^1_x = 0$, and (c) in
open-loop nonzero-sum games because $u^j_x = 0$. It certainly is to be expected,
therefore, that the closed-loop and open-loop Nash solutions are going
to be different, in general. This can be shown explicitly for the
linear-quadratic case.

We conclude this section by providing an interpretation to the adjoint
variable $\lambda^i$. It is the sensitivity of the $i$th player's profits to a perturbation in
the state vector. If the other players are using closed-loop (i.e., feedback)
strategies, any perturbation $\delta x$ in the state vector causes them to revise
their controls by the amount $\phi^j_x \delta x$. If the $i$th Hamiltonian $H^i$ were
already extremized with respect to $u^j$, $j \ne i$, this would not affect the
$i$th player's profit; but since $\partial H^i / \partial u^j \ne 0$ for $i \ne j$, the reactions of the
other players to the perturbation influence the $i$th player's profit, and
the $i$th player must account for this effect in considering variations of
the trajectory.



12.1.3 An Application to the Common-Property Fishery 
Resources 

Consider extending the fishery model of Section 10.1 by assuming that
there are two producers having unrestricted rights to exploit the fish
stock in competition with each other. This gives rise to a nonzero-sum
differential game analyzed by Clark (1976).

Equation (10.2) is modified by

$$\dot{x} = g(x) - q^1 u^1 x - q^2 u^2 x, \quad x(0) = x_0, \tag{12.19}$$

where $u^i(t)$ represents the rate of fishing effort and $q^i u^i x$ is the rate of
catch for the $i$th producer, $i = 1, 2$. The control constraints are

$$0 \le u^i(t) \le U^i, \quad i = 1, 2, \tag{12.20}$$

the state constraints are

$$x(t) \ge 0, \tag{12.21}$$

and the objective function for the $i$th producer is the total present value
of his profits, namely,

$$J^i = \int_0^\infty (p^i q^i x - c^i)\,u^i e^{-\rho t}\,dt, \quad i = 1, 2. \tag{12.22}$$

To find the Nash solution for this model, we let $\bar{x}^i$ denote the
turnpike (or optimal biomass) level given by (10.12) on the assumption that
the $i$th producer is the sole owner of the fishery. Let the bionomic
equilibrium $x_b^i$ for producer $i$ be defined by (10.4), i.e.,

$$x_b^i = \frac{c^i}{p^i q^i}. \tag{12.23}$$

As shown in Exercise 10.2, $x_b^i < \bar{x}^i$. If the other producer is not fishing,
then producer $i$ can maintain $x_b^i$ by making the fishing effort

$$\hat{u}^i = \frac{g(x_b^i)\,p^i}{c^i}; \tag{12.24}$$

here we have assumed $U^i$ to be sufficiently large so that $\hat{u}^i \le U^i$. We
also assume that

$$x_b^1 < x_b^2, \tag{12.25}$$

which means that producer 1 is more efficient than producer 2, i.e.,
producer 1 can make a positive profit at any level in the interval $(x_b^1, x_b^2]$,
while producer 2 loses money in the same interval, except at $x_b^2$, where
he breaks even. For $x > x_b^2$, both producers make positive profits.

Since $U^1 \ge \hat{u}^1$ by assumption, producer 1 has the capability of driving
the fish stock down to a level at or below $x_b^1$ which, by (12.25), is less
than $x_b^2$. This implies that producer 2 cannot operate at a sustained level
above $x_b^2$; and at a sustained level below $x_b^2$, he cannot make a profit.
Hence, his optimal policy is bang-bang:

$$u^{2*}(x) = \begin{cases} U^2 & \text{if } x > x_b^2, \\ 0 & \text{if } x \le x_b^2. \end{cases} \tag{12.26}$$



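As far as producer 1 is concerned, he wants to attain his turnpike level
$\bar{x}^1$ if $\bar{x}^1 \le x_b^2$. If $\bar{x}^1 > x_b^2$ and if $x_0 \ge x_b^2$, then from (12.26) producer 2
will fish at his maximum rate until the fish stock is driven to $x_b^2$. At this
level it is optimal for producer 1 to fish at a rate which maintains the
fish stock at level $x_b^2$ in order to keep producer 2 from fishing. By
(12.19), holding the stock at a level $x$ when only producer 1 fishes
requires the effort $g(x)/(q^1 x)$. Thus, the optimal policy for producer 1
can be stated as

$$u^{1*}(x) = \begin{cases} U^1 & \text{if } x > \bar{x}^1, \\ g(\bar{x}^1)/(q^1 \bar{x}^1) & \text{if } x = \bar{x}^1, \\ 0 & \text{if } x < \bar{x}^1, \end{cases} \quad \text{if } \bar{x}^1 < x_b^2, \tag{12.27}$$

$$u^{1*}(x) = \begin{cases} U^1 & \text{if } x > x_b^2, \\ g(x_b^2)/(q^1 x_b^2) & \text{if } x = x_b^2, \\ 0 & \text{if } x < x_b^2, \end{cases} \quad \text{if } \bar{x}^1 \ge x_b^2. \tag{12.28}$$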
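
The feedback rules for the two producers can be written out as a short sketch. Everything numerical here is an assumption made for illustration: the logistic growth function, the parameter values, and the function names are not from the text, and producer 1's sole-owner turnpike level is passed in as an input because (10.12) is not reproduced in this section. The stock-maintaining effort $g(x)/(q x)$ follows from setting the right-hand side of (12.19) to zero with only one producer fishing.

```python
# Illustrative logistic growth function and parameters (assumptions for
# this sketch; the text leaves g, p, q, c, and U general).
R, X_CAP = 1.0, 100.0

def growth(x):
    return R * x * (1.0 - x / X_CAP)

def bionomic_level(p, q, c):
    """Stock at which profit per unit effort p*q*x - c is zero, as in (12.23)."""
    return c / (p * q)

def sustaining_effort(x, q):
    """Effort holding the stock at x when only this producer fishes."""
    return growth(x) / (q * x)

def policy_2(x, xb2, U2):
    """Producer 2's bang-bang rule (12.26)."""
    return U2 if x > xb2 else 0.0

def policy_1(x, xbar1, xb2, q1, U1):
    """Producer 1's rule (12.27)-(12.28); the target stock is min(xbar1, xb2)."""
    target = min(xbar1, xb2)
    if x > target:
        return U1
    if x < target:
        return 0.0
    return sustaining_effort(target, q1)

xb1 = bionomic_level(p=2.0, q=0.5, c=20.0)   # producer 1 (lower cost)
xb2 = bionomic_level(p=2.0, q=0.5, c=40.0)   # producer 2 (higher cost)
print(xb1, xb2, policy_1(xb2, 60.0, xb2, 0.5, 3.0))
```

With the assumed turnpike level above producer 2's bionomic level, producer 1 holds the stock exactly at that level, which leaves producer 2 with no profitable opportunity to fish.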



The formal proof that policies (12.26)-(12.28) give a Nash solution
requires direct verification using the result of Section 10.1.2. The Nash
solution for this case means that for all feasible paths $u^1$ and $u^2$,

$$J^1(u^{1*}, u^{2*}) \ge J^1(u^1, u^{2*}), \tag{12.29}$$

and

$$J^2(u^{1*}, u^{2*}) \ge J^2(u^{1*}, u^2). \tag{12.30}$$

The direct verification involves defining a modified growth function

$$g^1(x) = \begin{cases} g(x) - q^2 U^2 x & \text{if } x > x_b^2, \\ g(x) & \text{if } x \le x_b^2, \end{cases}$$

and using the Green's theorem results of Section 10.1.2. Since $U^2 \ge \hat{u}^2$ by
assumption, we have $g^1(x) \le 0$ for $x > x_b^2$. From (10.12) with $g$ replaced
by $g^1$, it can be shown that the new turnpike level for producer 1 is
$\min(\bar{x}^1, x_b^2)$, which defines the optimal policy (12.27)-(12.28) for producer
1. The optimality of (12.26) for producer 2 follows easily.

To interpret the results of the model, suppose that producer 1 orig- 
inally has sole possession of the fishery, but anticipates a rival entry. 
Producer 1 will switch from his own optimal sustained yield x^* to a 
more intensive exploitation policy prior to the anticipated entry.

We can now guess the results in situations involving N producers. 
The fishery will see the progressive elimination of inefficient producers 
as the stock of fish decreases. Only the most efficient producers will 
survive. If, ultimately, two or more maximally efficient producers exist, 
the fishery will converge to a classical bionomic equilibrium, with zero 
sustained economic rent. 

We have now seen that a Nash competitive solution involving $N \ge 2$
producers results in the long-run dissipation of economic rents. This
conclusion depends on the assumption that producers face an infinitely
elastic supply of all factors of production going into the fishing effort, but
typically the methods of licensing entrants to regulated fisheries make
some attempt also to control the factors of production such as permitting
the licensee to operate only a single vessel of specific size.

In order to develop a model for licensing of fishermen, we let the
control variable $v^i$ denote the capital stock of the $i$th producer and let
the concave function $f(v^i)$, with $f(0) = 0$, denote the fishing mortality
function, for $i = 1, 2, \ldots, N$. This requires the replacement of $q^i u^i$ in the
previous model by $f(v^i)$. The extended model becomes nonlinear in
control variables. You are asked in Exercise 12.2 to formulate this new
model and develop necessary conditions for a closed-loop Nash solution
for this model with $N$ producers. The reader is referred to Clark (1976)
for further details.

For other papers on applications of differential games to fishery
management, see Hämäläinen, Haurie, and Kaitala (1984, 1985) and
Hämäläinen, Ruusunen, and Kaitala (1986, 1990). For applications to
problems in environmental management, see the edited volume by
Carraro and Filar (1995) on the topic.

Another area in which there have been many applications of
differential games is that of marketing in general and optimal advertising in
particular. Some references are Bensoussan, Bultez, and Naert (1978),
Deal, Sethi, and Thompson (1979), Deal (1979), Jørgensen (1982a), Rao
(1984, 1990), Dockner and Jørgensen (1986, 1992), Chintagunta and
Vilcassim (1992), Chintagunta and Jain (1994, 1995), and Fruchter (1999).
A survey of the literature is done by Jørgensen (1982a) and a monograph
is written by Erickson (1991).

For applications of differential games to economics and management
science in general, see the book by Dockner, Jørgensen, Long, and Sorger
(2000).



12.2 Distributed Parameter Systems 



Thus far, our efforts have been directed to the study of the control of 
systems governed by systems of ordinary differential or difference equa- 
tions. Such systems are often called lumped parameter systems. It is 
possible to generalize these to systems in which the state and control 
variables are defined in terms of space as well as time dimensions. These
are called distributed parameter systems and are described by a set of 
partial differential or difference equations. 

For example, in the lumped parameter advertising models of the type 
treated in Chapter 7, we need to obtain the optimal advertising expen- 
ditures for each instant of time. However, in the analogous distributed 
parameter advertising model we must obtain the optimal advertising ex- 
penditure at every geographic location of interest at each instant of time; 
see Seidman, Sethi, and Derzko (1987). In other economic problems, the 
spatial coordinates might be income, quality, age, etc. In Section 12.2.2 
we will discuss a cattle-ranching model of Derzko, Sethi, and Thompson 
(1980), in which the spatial dimension measures the age of a cow. 

Let $y$ denote a one dimensional spatial vector, let $t$ denote time,
and let $x(t, y)$ be a one dimensional state variable. Let $u(t, y)$ denote a
control variable, and let the state equation be

$$\frac{\partial x}{\partial t} = g\left(t, y, x, \frac{\partial x}{\partial y}, u\right) \tag{12.31}$$

for $t \in [0, T]$ and $y \in [0, h]$. We denote the region $[0, T] \times [0, h]$ by $D$,
and we let its boundary $\partial D$ be split into two parts $\Gamma_1$ and $\Gamma_2$ as shown
in Figure 12.1. The initial conditions will be stated on the part $\Gamma_1$ of
the boundary $\partial D$ as

$$x(0, y) = x_0(y) \tag{12.32}$$

and

$$x(t, 0) = v(t). \tag{12.33}$$

In Figure 12.1, (12.32) is the initial condition on the vertical portion
of $\Gamma_1$, whereas (12.33) is that on the horizontal portion of $\Gamma_1$. More
specifically, in (12.32) the function $x_0(y)$ gives the starting distribution
of $x$ with respect to the spatial coordinate $y$. The function $v(t)$ in (12.33)
is an exogenous breeding function at time $t$ of $x$ when $y = 0$. In the cattle
ranching example in Section 12.2.2, $v(t)$ measures the number of newly
born calves at time $t$. To be consistent we make the obvious assumption
that

$$x(0, 0) = x_0(0) = v(0). \tag{12.34}$$



Figure 12.1: Region $D$ with Boundaries $\Gamma_1$ and $\Gamma_2$

Let $F(t, y, x, u)$ denote the profit rate when $x(t, y) = x$, $u(t, y) = u$
at a point $(t, y)$ in $D$. Let $Q(t)$ be the value of one unit of $x(t, h)$ at time
$t$ and let $S(y)$ be the value of one unit of $x(T, y)$ at time $T$. Then the
objective function is:

$$\max_{u(t,y) \in \Omega} \left\{ J = \int_0^T \int_0^h F(t, y, x, u)\,dy\,dt + \int_0^T Q(t)\,x(t, h)\,dt + \int_0^h S(y)\,x(T, y)\,dy \right\}, \tag{12.35}$$

where $\Omega$ is the set of allowable controls.

12.2.1 The Distributed Parameter Maximum Principle 

We will formulate, without giving proofs, a procedure for solving the
problem in (12.31)-(12.35) by a distributed parameter maximum
principle, which is analogous to the ordinary one. A more complete
treatment of this maximum principle can be found in Sage (1968, Chapter 7),
Butkowskiy (1969), Lions (1971), Derzko, Sethi, and Thompson (1984),
and Haurie, Sethi, and Hartl (1984).

In order to obtain necessary conditions for a maximum, we introduce
the Hamiltonian

$$H = F + \lambda g, \tag{12.36}$$

where the spatial adjoint function $\lambda(t, y)$ satisfies

$$\frac{\partial \lambda}{\partial t} = -\frac{\partial H}{\partial x} + \frac{\partial}{\partial t}\left[\frac{\partial H}{\partial x_t}\right] + \frac{\partial}{\partial y}\left[\frac{\partial H}{\partial x_y}\right], \tag{12.37}$$

where $x_t = \partial x/\partial t$ and $x_y = \partial x/\partial y$. The boundary conditions on $\lambda$ are
stated for the $\Gamma_2$ part of the boundary of $D$ (see Figure 12.1) as follows:

$$\lambda(t, h) = Q(t) \tag{12.38}$$

and

$$\lambda(T, y) = S(y). \tag{12.39}$$

Once again we need a consistency requirement similar to (12.34). It is

$$\lambda(T, h) = Q(T) = S(h), \tag{12.40}$$

which gives the consistency requirement in the sense that the price and
the salvage value of a unit $x(T, h)$ must agree.

We let $u^*(t, y)$ denote the optimal control function. Then the
distributed parameter maximum principle requires that

$$H(t, y, x^*, x^*_t, x^*_y, u^*, \lambda^*) \ge H(t, y, x^*, x^*_t, x^*_y, u, \lambda^*) \tag{12.41}$$

for all $(t, y) \in D$ and all $u \in \Omega$.

We have stated only a simple form of the distributed parameter
maximum principle which is sufficient for the cattle ranching example dealt
with in the next section. More general forms of the maximum principle
are available in the references cited earlier. Among other things, these
general forms allow for the function $F$ in (12.35) to contain arguments
such as $\partial x/\partial y$, $\partial^2 x/\partial y^2$, etc. It is also possible to consider controls on
the boundary. In this case $v(t)$ in (12.33) will become a control variable.



12.2.2 The Cattle Ranching Problem 

Let $t$ denote time and $y$ denote the age of an animal. Let $x(t, y)$ denote
the number of cattle of age $y$ on the ranch at time $t$. Let $h$ be the age
at maturity at which the cattle are slaughtered. Thus, the set $[0, h]$ is
the set of all possible ages of the cattle. Let $u(t, y)$ be the rate at which
$y$-aged cattle are bought at time $t$, where we agree that a negative value
of $u$ denotes a sale.

To develop the dynamics of the process it is easy to see that

$$x(t + \Delta t, y) = x(t, y - \Delta t) + u(t, y)\,\Delta t. \tag{12.42}$$

Subtracting $x(t, y)$ from both sides of (12.42), dividing by $\Delta t$, and taking
the limit as $\Delta t \to 0$, yields the state equation

$$\frac{\partial x}{\partial t} = -\frac{\partial x}{\partial y} + u(t, y). \tag{12.43}$$

The boundary and consistency conditions for $x$ are given in (12.32)-
(12.34). Here $x_0(y)$ denotes the initial distribution of cattle at various
ages, and $v(t)$ is an exogenously specified breeding rate.
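
The difference relation (12.42) can be stepped directly on a grid whose age spacing equals the time step. The sketch below is only an illustration under that assumption; the function name, the argument conventions, and the sample herd are hypothetical and not from the text. At each step every cohort ages by one grid cell, net purchases are added, newborn calves enter at age zero through the boundary condition (12.33), and the cohort that was at the last age index leaves the herd.

```python
def simulate_herd(x0, v, u, dt, n_steps):
    """Step (12.42) on an age grid with spacing dt.

    x0: list of initial herd sizes by age index (age = j*dt), as in (12.32)
    v:  function n -> births at time step n, as in (12.33)
    u:  function (n, j) -> purchase rate (negative means a sale)
    """
    x = list(x0)
    history = [x]
    for n in range(n_steps):
        new = [0.0] * len(x)
        new[0] = v(n + 1)                      # boundary condition at y = 0
        for j in range(1, len(x)):
            new[j] = x[j - 1] + u(n, j) * dt   # aging by dt plus net purchases
        x = new
        history.append(x)
    return history

# Example with a constant birth rate and no trading; x0[0] equals v(0),
# in line with the consistency condition (12.34).
hist = simulate_herd([5.0, 8.0, 10.0, 10.0], v=lambda n: 5.0,
                     u=lambda n, j: 0.0, dt=1.0, n_steps=3)
print(hist[-1])
```

With no purchases or sales, the initial age distribution is simply transported along the 45-degree lines in the $(t, y)$ plane, which is why those lines matter in the explicit solutions.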

To develop the objective function for the cattle rancher, we let T 
denote the horizon time. Let P{t, y) be the purchase or sale price of a 
t/-aged animal at time t. Let P{t, h) = Q{t) be the slaughter value at 
time t and let P{T,y) = S(y) be the salvage value of a y-aged animal 
at the horizon time T. The functions Q and S represent the proceeds 
of the cattle ranching business. To obtain the profit function we must 
subtract the costs of running the ranch from these proceeds. Let C{y) 
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be the feeding and corralling costs for a y-aged animal per unit of time. 
Let u(t, y) denote the goal level purchase rate of y-aged cattle at time 
t. Any deviation from this goal level is expensive, and the deviation 
penalty cost is given by q[u{t, y) — u(t, y)]^, where g is a constant. Thus, 
the profit maximizing objective function is 

^ lo lo ~ y) + 2 /)] dydt 

+ f Q(t)x{t,h)dt+ f S{y)x{T,y)dy. (12.44) 

7o JQ 

Comparing this with (12.2) we see that 

F{t,y,x,u) = -[q{u-u{t,y)f + C{y)x + P[t,y)u], 

We assume Vt ~ which means that the control variable is imcon- 
st rained. 

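To solve the problem we form the Hamiltonian

$$H = -[q(u - \bar{u}(t, y))^2 + C(y)x + P(t, y)u] + \lambda\left(-\frac{\partial x}{\partial y} + u\right), \tag{12.45}$$

where the adjoint function $\lambda(t, y)$ satisfies

$$\frac{\partial \lambda}{\partial t} = -\frac{\partial \lambda}{\partial y} + C(y) \tag{12.46}$$

subject to the boundary and consistency conditions (12.38)-(12.40). In
order to maximize the Hamiltonian, we differentiate $H$ with respect to
$u$ and set it to zero, giving

$$u^*(t, y) = \bar{u}(t, y) + \frac{1}{2q}[\lambda(t, y) - P(t, y)]. \tag{12.47}$$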

The form of this optimal control is just like that found in the production-
inventory example of Chapter 6. To compute $u^*$, we must solve for
$\lambda(t, y)$. It is easy to verify that the general solution of (12.46) is of the
form

$$\lambda(t, y) = -\int_y^k C(\tau)\,d\tau + g(t - y), \tag{12.48}$$

where $g$ is an arbitrary one-variable function and $k$ is a constant. We
will use the boundary conditions to determine $g$ and $k$.

In order to state the explicit solution for $\lambda$ (and later for $x$), we divide
the region $D$ of Figure 12.2 into the three regions $D_1$, $D_2$, and $D_3$.

Figure 12.2: A Partition of Region D

The 45° line from $(0, 0)$ to $(h, h)$ belongs to both $D_1$ and $D_2$, and the 45°
line from $(T - h, 0)$ to $(T, h)$ belongs to both $D_2$ and $D_3$. Thus, $D_1$, $D_2$,
and $D_3$ are closed sets. The reason why these 45° lines are important
in the solution for $\lambda$ comes from the fact that the determination of the
arbitrary function $g(t - y)$ involves the term $(t - y)$. We use the condition
(12.38) on $\lambda(t, h)$ to obtain $\lambda(t, y)$ in the region $D_1 \cup D_2$. We substitute
(12.38) into (12.48) and get



$$\lambda(t, h) = -\int_h^k C(\tau)\,d\tau + g(t - h) = Q(t).$$

This gives

$$g(t - h) = Q(t) + \int_h^k C(\tau)\,d\tau,$$

or

$$g(t - y) = Q(t - y + h) + \int_h^k C(\tau)\,d\tau.$$

Substituting for $g(t - y)$ in (12.48) gives us the solution

$$\lambda(t, y) = Q(t - y + h) - \int_y^k C(\tau)\,d\tau + \int_h^k C(\tau)\,d\tau = Q(t - y + h) - \int_y^h C(\tau)\,d\tau$$

in the region $D_1 \cup D_2$.

For region $D_3$, we use the condition (12.39) on $\lambda(T, y)$ in (12.48) to
obtain

$$\lambda(T, y) = -\int_y^k C(\tau)\,d\tau + g(T - y) = S(y).$$

This gives

$$g(T - y) = S(y) + \int_y^k C(\tau)\,d\tau,$$

or

$$g(t - y) = S(T - t + y) + \int_{T - t + y}^k C(\tau)\,d\tau.$$

Substituting this value of $g(t - y)$ in (12.48) gives the solution

$$\lambda(t, y) = S(T - t + y) - \int_y^{T - t + y} C(\tau)\,d\tau$$

in region $D_3$.

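We can now write the complete solution for $\lambda(t, y)$ as

$$\lambda(t, y) = \begin{cases} Q(t - y + h) - \int_y^h C(\tau)\,d\tau & \text{for } (t, y) \in D_1 \cup D_2, \\ S(T - t + y) - \int_y^{T - t + y} C(\tau)\,d\tau & \text{for } (t, y) \in D_3. \end{cases} \tag{12.49}$$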



Note that $\lambda(T, h) = P(T, h) = Q(T) = S(h)$. The solution for $x$ is
obtained in a similar manner. We substitute (12.47) into (12.43) and use
the boundary and consistency conditions (12.32)-(12.34). The complete
solution is given as

$$x(t, y) = \begin{cases} x_0(y - t) + \int_0^t u^*(\tau, y - t + \tau)\,d\tau & \text{for } (t, y) \in D_1, \\ v(t - y) + \int_0^y u^*(t - y + \tau, \tau)\,d\tau & \text{for } (t, y) \in D_2 \cup D_3. \end{cases} \tag{12.50}$$

We can interpret the solution (12.50) in the region $D_1$ as the beginning
game, which is completely characterized by the initial distribution $x_0$.
Also the solution (12.49) in region $D_3$ is the ending game, because in this
region the animals do not mature, but must be sold at whatever their
age is at the terminal time $T$. The first expression in (12.49) and the
second expression in (12.50) hold in region $D_2$, which can be interpreted
as the middle game portion of the solution.

12.2.3 Interpretation of the Adjoint Function

It is instructive to interpret the solution for $\lambda(t, y)$ in (12.49). An animal
at age $y$ at time $t$, where $(t, y)$ is in $D_1 \cup D_2$, will mature at time $t - y + h$.
Its slaughter value at that time is $Q(t - y + h)$. However, the total feeding
and corralling cost in keeping the animal from its age $y$ until it matures
is given by $\int_y^h C(\tau)\,d\tau$. Thus, $\lambda(t, y)$ represents the net benefit obtained
from having an animal at age $y$ at time $t$. You should give a similar
interpretation for $\lambda$ in region $D_3$.

Having this interpretation for $\lambda$, it is easy to interpret the optimal
control $u^*$ in (12.47). Whenever $\lambda(t, y) > P(t, y)$, we buy more than the
goal level $\bar{u}(t, y)$, and when $\lambda(t, y) < P(t, y)$, we buy less than the goal
level.
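
The closed forms (12.49) and (12.47) are straightforward to evaluate numerically. The following Python sketch is only an illustration: the functions Q, S, C, P, and u_bar and all parameter values are hypothetical choices made for the example, not data from the text. Region D_3 is taken to be the set where t - y > T - h, i.e., the animals that cannot mature before the horizon.

```python
# Hypothetical data (assumptions for illustration, not from the text)
T, h, q = 5.0, 2.0, 0.5            # horizon, maturity age, penalty constant


def C(y):
    """Feeding and corralling cost rate for a y-aged animal (assumed linear)."""
    return 1.0 + 0.5 * y


def Q(t):
    """Slaughter value at time t (assumed constant)."""
    return 10.0


def S(y):
    """Salvage value at the horizon; chosen so that S(h) = Q(T)."""
    return 10.0 * y / h


def P(t, y):
    """Purchase or sale price of a y-aged animal (assumed)."""
    return 4.0 * y


def u_bar(t, y):
    """Goal-level purchase rate (assumed constant)."""
    return 1.0


def integral(f, a, b, n=2000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    if a == b:
        return 0.0
    step = (b - a) / n
    inner = sum(f(a + i * step) for i in range(1, n))
    return (0.5 * (f(a) + f(b)) + inner) * step


def adjoint(t, y):
    """lambda(t, y) as given by the two branches of (12.49)."""
    if t - y <= T - h:                                # regions D1 and D2
        return Q(t - y + h) - integral(C, y, h)
    return S(T - t + y) - integral(C, y, T - t + y)   # region D3


def optimal_purchase(t, y):
    """u*(t, y) from (12.47)."""
    return u_bar(t, y) + (adjoint(t, y) - P(t, y)) / (2.0 * q)
```

With these assumed data, adjoint(t, h) returns Q(t) and adjoint(T, y) returns S(y), which is how the boundary conditions show up numerically; optimal_purchase exceeds the goal level exactly where the adjoint exceeds the price.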

Muzicant (1980) considers an extension of the cattle ranching prob-
lem to allow the breeding rate $v(t)$ to be controlled. Her model is re-
produced in Feichtinger and Hartl (1986). Other applications of the
distributed parameter control system model are the following problems:
inventory control incorporating product quality deterioration, see Ben- 
soussan, Nissen, and Tapiero (1975); production and inventory systems, 
see Tzafestas (1982); personnel planning, see Gaimon and Thompson 
(1984b); social services planning, see Haurie, Sethi, and Hartl (1984); 
and consumer durables with age/quality structure, see Robson (1985). 
We believe that many other applications are possible and that they rep- 
resent fruitful areas for research. 

12.3 Impulse Control 

In Chapters 3 and 4 we studied the control of systems governed by or- 
dinary differential equations. In these cases the state variable can only 
change continuously since the control affects only the time derivatives of 
the state variables. In Chapters 5 and 7, we developed some models in 
which the state variable did change instantaneously by a finite amount as 
a result of the application of an impulse control. These situations arose 
when an optimal control policy was bang-bang, but there was no upper 
bound on the control variable. However, there are other kinds of situa-
tions in management science and related areas, in which a finite change
in the value of a state variable is explicitly permitted at some instants
of time to be determined, and these do not easily lend themselves to the
impulse control methodology given in Chapters 5 and 7. For that reason
we shall briefly discuss a more general theory of impulse control that is
capable of handling such problems. The state variable dynamics in these
problems cannot be described by ordinary differential equations.

Consider the example of an oil producer who pumps oil from a single
well. The output rate of the well is proportional to the remaining stock
$x(t)$, so that the output rate is $bx(t)$, where $b$ is a constant. The state
equation is thus

$$\dot{x} = -bx, \quad x(0) = 1, \tag{12.51}$$

where 1 is the starting stock of a new oil well. After producing for some
time, the oil remaining in the well becomes low so that the output rate
is too low to be economic to pump. The oil producer then abandons the
existing well and drills a new one, and repeats this procedure until the
horizon time $T$. Let $t_1, \ldots, t_{N(T)}$ be the times at which the new wells
are drilled; note that $N(T)$ is the total number of wells drilled. The
graph of the stock of oil available to the producer at times in $[0, T]$ is
sketched in Figure 12.3.



Figure 12.3: Solution of Equation 12.52

A convenient way of formally describing this process is to use the
Dirac delta function defined as

$$\delta(y) = \lim_{\varepsilon \to 0} \delta_\varepsilon(y), \quad \text{where} \quad \delta_\varepsilon(y) = \begin{cases} 0 & \text{when } |y| > \varepsilon, \\ 1/(2\varepsilon) & \text{when } |y| \le \varepsilon. \end{cases}$$

It is clear that $\delta(y) = 0$ for $y \ne 0$. We can now write the state equation
as a modification of (12.51) to account for new wells drilled. Thus,

$$\dot{x}(t) = -bx(t) + \sum_{i=1}^{N(T)} \delta(t - t_i)[1 - x(t)], \quad x(0) = 1, \tag{12.52}$$

where $\delta(t - t_i)$ is the Dirac delta function. This new equation has the
same solution as that of (12.51) in the interval $[0, t_1)$. We define $x(t_1) =
x(t_1^-)$. The stock of oil just after $t_1$, i.e., $x(t_1^+)$, is given by

$$x(t_1^+) = x(t_1^-) + \int_{t_1^-}^{t_1^+} \delta(t - t_1)[1 - x(t)]\,dt = x(t_1) + [1 - x(t_1)] = 1. \tag{12.53}$$

Continuation of this procedure shows that the graph of the solution of
(12.52) in $[0, T]$ is the one shown in Figure 12.3.
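
The sawtooth path described above can be traced with a few lines of Python. This is a minimal sketch, assuming the drilling times are supplied exogenously; the function name oil_stock and the sample values in the usage note are illustrative and not part of the text.

```python
import math


def oil_stock(t, drill_times, b):
    """Stock x(t): exponential decay at rate b, reset to 1 at each drilling time.

    Uses the convention x(t_i) = x(t_i^-), so at a drilling instant the
    value returned is the stock just before the new well is drilled.
    """
    last_reset = 0.0                    # x(0) = 1 acts like a reset at time 0
    for ti in sorted(drill_times):
        if ti < t:                      # only drillings strictly before t count
            last_reset = ti
    return math.exp(-b * (t - last_reset))
```

For instance, with b = 0.5 and drilling times 2 and 4, oil_stock(3.0, [2.0, 4.0], 0.5) measures one unit of time of decay from a fresh well, the same value as oil_stock(1.0, [2.0, 4.0], 0.5).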

In equation (12.52), the times $t_i$ at which the wells were drilled were
supplied exogenously. In an optimal impulse control problem, the the-
ory of which was developed by Bensoussan and Lions (1975), these times
would be determined in order to maximize a given objective function;
see also Motta and Rampazzo (1996) and Silva and Vinter (1999). In
the next section we formulate the complete version of the oil driller’s
problem.

12.3.1 The Oil Driller’s Problem 

Let $v(t)$ be the drilling control variable, where $v(t) \in [0, 1]$; $v(t) = 0$ will
denote no drilling and $v(t) = 1$ will denote the drilling of a single well
having an initial stock of one unit of oil. The other values of $v(t)$ in $[0, 1]$
are possible but will not occur in this example, because $v(t)$ appears
linearly in the problem. We can now rewrite (12.52) as

$$\dot{x}(t) = -bx(t) + \sum_{i=1}^{N(T)} \delta(t - t_i)[v(t) - x(t)], \quad x(0) = 1. \tag{12.54}$$

The solution of the problem involves the determination of the drilling
times $t_i$, the magnitude $v(t_i)$, and the number $N(T)$ of drillings. Note
in (12.54) that if $t \ne t_i$ for all $i$, then the summation term is zero, since
$\delta(t - t_i) = 0$ when $t \ne t_i$. If $t = t_i$, then $x(t_i^+) = v(t_i)$, which means that
we have abandoned the old well and drilled a new well with stock equal
to $v(t_i)$. A special case of this problem occurs when $N(T)$ is given. This
case reduces to an ordinary calculus problem; see Exercise 12.8.

The objective of the oil driller is to maximize his profit, which is the
difference between the revenue from the oil pumped and the cost of
drilling new wells. Since the cost of pumping the oil is fixed, we ignore
it. The objective is to

$$\text{maximize} \quad J = \int_0^T Pbx(t)\,dt - \sum_{i=1}^{N(T)} Qv(t_i), \tag{12.55}$$

where $P$ is the unit price of oil and $Q$ is the cost of drilling a
well having an initial stock of 1. Therefore, $P$ is the total value of the
entire stock of oil in a new well. See Case (1979) for a re-interpretation
of this problem as “The Optimal Painting of a Roadside Inn.”

In the next section we will discuss a maximum principle for impulse
optimal control problems, based on the work by Blaquiere (1979). The
solution of the oil driller’s problem appears in Section 12.3.3.



12.3.2 The Maximum Principle for Impulse Optimal Control



In (2.4) we stated the basic optimal control problem with state variable
$x$ and an ordinary control variable $u$. Now we will add an impulse con-
trol variable $v \in \Omega_v$ and two associated functions. The first function is
$G(x, v, t)$, which represents the cost or profit associated with the impulse
control. The second function is $g(x, v, t)$, which represents the instan-
taneous finite change in the state variable when the impulse control is
applied.

With this notation we can state the impulse optimal control problem
as:

$$\max \left\{ J = \int_0^T F(x, u, t)\,dt + \sum_{i=1}^{N(T)} G[x(t_i), v(t_i), t_i] + S[x(T)] \right\}$$

subject to

$$\dot{x}(t) = f(x(t), u(t), t) + \sum_{i=1}^{N(T)} g[x(t_i), v(t_i), t_i]\,\delta(t - t_i), \quad x(0) = x_0, \tag{12.56}$$

$$u \in \Omega_u, \quad v \in \Omega_v.$$

As before we will have

$$x(t_i^+) = x(t_i) + g[x(t_i), v(t_i), t_i] \tag{12.57}$$

at the times $t_i$ at which impulse control is applied, i.e., $g[x(t_i), v(t_i), t_i] > 0$.

Blaquiere (1978) has developed the maximum principle necessary op-
timality conditions to deal with the problem in (12.56). To state these
conditions we first define the ordinary Hamiltonian function

$$H(x, u, \lambda, t) = F(x, u, t) + \lambda f(x, u, t) \tag{12.58}$$

and the impulse Hamiltonian function

$$H^I(x, v, t) = G(x, v, t) + \lambda(t^+)g(x, v, t). \tag{12.59}$$

We now state the impulse maximum principle. Let $u^*$, $v^*(t_i)$, $i =
1, \ldots, N^*(T)$, where $t_i > t_{i-1} > 0$, be optimal controls with the asso-
ciated $x^*$ representing an optimal trajectory for (12.56). Then there
exists an adjoint variable $\lambda$ such that the following conditions hold:

(i) $\dot{x}^* = f(x^*, u^*, t)$, $t \in [0, T]$,

(ii) $x^*(t_i^+) = x^*(t_i) + g[x^*(t_i), v^*(t_i), t_i]$,

(iii) $\dot{\lambda} = -H_x[x^*, u^*, \lambda, t]$, $\lambda(T) = S_x[x^*(T)]$, $t \ne t_i$,

(iv) $\lambda(t_i) = \lambda(t_i^+) + H^I_x[x^*(t_i), v^*(t_i), t_i]$,

(v) $H[x^*, u^*, \lambda, t] \ge H[x^*, u, \lambda, t]$ for all $u \in \Omega_u$, $t \ne t_i$,

(vi) $H^I[x^*(t_i), v^*(t_i), t_i] \ge H^I[x^*(t_i), v, t_i]$ for all $v \in \Omega_v$,

(vii) $H[x^*(t_i^+), u^*(t_i^+), \lambda(t_i^+), t_i] + H^I_t[x^*(t_i^+), v^*(t_i^+), t_i] =
H[x^*(t_i), u^*(t_i), \lambda(t_i), t_i] + H^I_t[x^*(t_i), v^*(t_i), t_i]$.

(12.60)

If $t_1 = 0$ then the equality sign in (vii) should be replaced by a $\ge$ sign
when $i = 1$. Clearly (i) and (ii) are equivalent to the state equation in
(12.56) with the optimum value substituted for the variables.

Note that condition (vii) involves the partial derivative of $H^I$ with
respect to $t$. Thus, in autonomous problems, where $H^I_t = 0$, condition
(vii) means that the Hamiltonian $H$ is continuous at those times where
an impulse control is applied.

12.3.3 Solution of the Oil Driller’s Problem

We now give a solution to the oil driller’s problem under the assumption
that $T$ is sufficiently small so that no more than one drilling will be found
to be optimal. We restate below the problem in Section 12.3.1 for this
case when $t_1$ is the drilling time to be determined:

$$\max \left\{ J = \int_0^T Pbx(t)\,dt - Qv(t_1) \right\}$$

subject to

$$\dot{x}(t) = -bx(t) + \delta(t - t_1)v(t)[1 - x(t)], \quad x(0) = 1, \tag{12.61}$$

$$0 \le v(t) \le 1.$$



To apply the maximum principle we define the Hamiltonian functions
corresponding to (12.58) and (12.59):

$$H(x, \lambda) = Pbx + \lambda(-bx) = bx(P - \lambda) \tag{12.62}$$

and

$$H^I(x, v) = -Qv + \lambda(t^+)v(1 - x). \tag{12.63}$$

The ordinary Hamiltonian (12.62) is without an ordinary control variable
because we do not have any ordinary control variable in this problem.

We now apply the necessary conditions of (12.60) to the oil driller’s
problem:

$$\dot{x} = -bx, \tag{12.64}$$

$$x(t_1^+) = x(t_1) + v(t_1)[1 - x(t_1)], \tag{12.65}$$

$$\dot{\lambda} = -b(P - \lambda), \quad \lambda(T) = 0, \quad t \ne t_1, \tag{12.66}$$

$$\lambda(t_1) = \lambda(t_1^+) - v(t_1)\lambda(t_1^+), \tag{12.67}$$

$$[-Q + \lambda(t_1^+)(1 - x)]v^*(t_1) \ge [-Q + \lambda(t_1^+)(1 - x)]v \quad \text{for } v \in [0, 1], \tag{12.68}$$

$$bx(t_1^+)[P - \lambda(t_1^+)] = bx(t_1)[P - \lambda(t_1)]. \tag{12.69}$$

The solution of (12.66) for $t > t_1$, where $t_1$ is the drilling time, is

$$\lambda(t) = P[1 - e^{-b(T - t)}], \quad t \in (t_1, T]. \tag{12.70}$$

From (12.68), the optimal impulse control at $t_1$ is

$$v^*(t_1) = \text{bang}[0, 1; -Q + \lambda(t_1^+)\{1 - x(t_1)\}]. \tag{12.71}$$

Note that the optimal impulse control is bang-bang, because the impulse
Hamiltonian $H^I$ in (12.63) is linear in $v$. The region in the $(t, x)$-space
in which there is certain to be no drilling is given by the set of all $(t, x)$
for which

$$-Q + \lambda(t^+)(1 - x) < 0. \tag{12.72}$$

After the drilling, $\lambda(t_1^+) = P[1 - e^{-b(T - t_1)}]$, which is given by (12.70).
Substituting this value into (12.72) gives the condition

$$-Q + P[1 - e^{-b(T - t_1)}](1 - x) < 0. \tag{12.73}$$

The boundary of the no-drilling region is obtained when the $<$ sign
in (12.73) is replaced by an $=$ sign. This yields the curve

$$x(t_1) = 1 - \frac{Q}{P(1 - e^{-b(T - t_1)})} := \phi(t_1), \tag{12.74}$$



shown in Figure 12.4. Note that we did not label the region below the
curve represented by the complement of (12.73) with $v^* = 1$, since (12.73)
uses the value of $\lambda(t)$, $t > t_1$, and its complement, therefore, does not
represent the condition $-Q + \lambda(t^+)(1 - x) > 0$ prior to drilling.

Figure 12.4: Boundary of No-Drilling Region

Figure 12.4 is drawn under the assumption that

$$\phi(0) = 1 - \frac{Q}{P[1 - e^{-bT}]} > 0, \tag{12.75}$$

so that the solution is nontrivial; see Exercises 12.5 and 12.6. In order
for (12.75) to hold, it is clear that $Q$ must be relatively small, and $P$, $b$,
and/or $T$ must be relatively large. In the contrary situation, $\phi(0) \le 0$
holds and by Exercise 12.6, no drilling is optimal.

The bang-bang nature of the optimal impulse control in (12.71)
means that we have

$$v^*(t_1) = 1.$$

Using this in (12.65) and (12.67) we have

$$x(t_1^+) = 1 \quad \text{and} \quad \lambda(t_1) = 0. \tag{12.76}$$

From (12.70),

$$\lambda(t_1^+) = P[1 - e^{-b(T - t_1)}].$$

Substituting these results into (12.69) and solving we get

$$x(t_1) = e^{-b(T - t_1)} := \psi(t_1). \tag{12.77}$$

The curve $x = \psi(t) = e^{-b(T - t)}$ is drawn in Figure 12.5. The part
$BC$ of the $\psi$ curve lies in the no-drilling region, which is above the $\phi$
curve as indicated in Figure 12.5. The part $AB$ of the $\psi$ curve is shown
darkened and represents the drilling curve for the problem. The optimal
state trajectory starts from $x(0) = 1$ and decays exponentially at rate $b$
until it hits the drilling curve $AB$ at point $Y$. This intersection point $Y$
determines the drilling time $t_1$. In Exercise 12.7 you are asked to show
that $t_1 = T/2$. At time $t_1$, the old well is abandoned, and a new well
is drilled making $x(t_1^+) = 1$. The stock of the new well follows a similar
decay path as shown.

Figure 12.5: Drilling Time
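
The drilling time can also be checked by brute force. The sketch below assumes exactly one drilling with $v(t_1) = 1$, so the profit is the revenue of the old well on $[0, t_1]$ plus that of the new unit-stock well on $(t_1, T]$, less $Q$. The function names and the parameter values in the usage note are illustrative assumptions, not values from the text.

```python
import math


def profit_one_drilling(t1, P, Q, b, T):
    """Profit J when a single unit-stock well is drilled at time t1.

    Integrating P*b*x(t) along x(t) = exp(-b t) on [0, t1] and along
    x(t) = exp(-b (t - t1)) on (t1, T] gives the two revenue terms.
    """
    old_well = P * (1.0 - math.exp(-b * t1))
    new_well = P * (1.0 - math.exp(-b * (T - t1)))
    return old_well + new_well - Q


def best_drilling_time(P, Q, b, T, n=10000):
    """Grid search for the profit-maximizing drilling time on [0, T]."""
    grid = [T * i / n for i in range(n + 1)]
    return max(grid, key=lambda t1: profit_one_drilling(t1, P, Q, b, T))
```

With the assumed values P = 10, Q = 1, b = 0.5, and T = 4, the search returns a drilling time at the midpoint of the horizon, where the decaying stock $e^{-bt}$ meets the curve $e^{-b(T - t)}$. The grid search is a stand-in for the calculus argument of Exercise 12.8; it says nothing about whether drilling at all is profitable, which depends on $Q$.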

The complete solution for the adjoint variable $\lambda(t)$ is obtained from
(12.70), (12.76), and (12.66) as

$$\lambda(t) = \begin{cases} P[1 - e^{-b(T - t)}] & \text{for } t \in (t_1, T], \\ P[1 - e^{b(t - t_1)}] & \text{for } t \in [0, t_1]. \end{cases}$$

The drilling threshold $-Q + \lambda(t^+)(1 - x)$, which is the argument of
the function in (12.71), has a graph as shown in Figure 12.6. Note that
$t_1$ is a point of discontinuity of the drilling function.

Figure 12.6: Value of $-Q + \lambda(t^+)[1 - x(t)]$

Figure 12.5 is drawn on the assumption that $T$ is sufficiently large so
that the $x$ trajectory hits the drilling curve $AB$ at time $t_1$.

If $T$ is too small, the $x$ trajectory will not hit $AB$ and no drilling is
optimal. For

$$T = \hat{T} = -\frac{2}{b}\ln\left(1 - \sqrt{\frac{Q}{P}}\right), \tag{12.78}$$

the $x$ trajectory goes through point $B$ of Figure 12.5, which is the point
at which we are indifferent between drilling and no drilling.

In Exercise 12.9 you are asked to derive (12.78) and to show that for
$T > \hat{T}$, the value of $-Q + \lambda(t_1^+)[1 - x(t_1)]$ is positive as shown in Figure
12.6, so that there will be drilling at $t_1$.
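
A quick numerical check of the indifference horizon is possible. At $t_1 = T/2$ we have $x(t_1) = e^{-bT/2}$ and $\lambda(t_1^+) = P[1 - e^{-bT/2}]$, so the drilling function vanishes when $P(1 - e^{-bT/2})^2 = Q$, that is, when $T = -(2/b)\ln(1 - \sqrt{Q/P})$. The Python sketch below evaluates both sides; the function names and the sample parameters in the usage note are illustrative assumptions.

```python
import math


def critical_horizon(P, Q, b):
    """Horizon at which the driller is indifferent; requires 0 < Q < P."""
    return -(2.0 / b) * math.log(1.0 - math.sqrt(Q / P))


def drilling_function_at_t1(P, Q, b, T):
    """Value of -Q + lambda(t1+) * [1 - x(t1)] at the drilling time t1 = T/2."""
    t1 = T / 2.0
    lam_plus = P * (1.0 - math.exp(-b * (T - t1)))   # adjoint just after drilling
    x_t1 = math.exp(-b * t1)                          # old well's stock at t1
    return -Q + lam_plus * (1.0 - x_t1)
```

For the assumed values P = 10, Q = 1, and b = 0.5, the drilling function is zero at the critical horizon, positive for longer horizons, and negative for shorter ones.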

The oil driller’s problem is a pure impulse optimal control problem as 
it did not involve any ordinary control variable. In the next section, we 
formulate a machine maintenance and replacement problem containing 
both ordinary and impulse control variables. The model, which gen- 
eralizes Thompson’s maintenance model described in Section 9.1.1 and 
which is similar to the Sethi-Morton model described in Section 9.3.1, 
is due to Blaquiere (1979, 1985). For related papers, see Gaimon and 
Thompson (1984a, 1989) and Gaimon (1986b, 1986c). 

12.3.4 Machine Maintenance and Replacement 

In order to define the model, we use the following notation: 

$T$ = the given terminal or horizon time,
$x(t)$ = the quality of the machine at time $t$, $0 \le x \le 1$; a
higher value of $x$ denotes a better quality,
$u(t)$ = the ordinary control variable denoting the rate of
maintenance at time $t$; $0 \le u \le U < b/g$,
$b$ = the constant rate at which quality deteriorates in the
absence of any maintenance,
$g$ = the maintenance effectiveness coefficient,
$\pi$ = the production rate per unit time per unit quality of
the machine,
$K$ = the trade-in value per unit quality, i.e., the old machine
provides only a credit against the price of the new
machine and it has no terminal salvage value,
$C$ = the cost of new machine per unit quality; $C > K$,
$t_1$ = the replacement time; for simplicity we assume at most
one replacement to be optimal in the given horizon
time; see Section 12.3.3,
$v$ = the replacement variable; $0 \le v \le 1$; $v$ represents a
fraction of the old machine replaced by the same frac-
tion of a new machine. This interpretation will make
sense because we will show that $v$ is either 0 or 1.

With this notation, the impulse optimal control problem is:

$$\max \left\{ J = \int_0^T (\pi x - u)\,dt + v(t_1)[Kx(t_1) - C] \right\}$$

subject to

$$\dot{x} = -bx + gu + \delta(t - t_1)v(t)[1 - x(t)], \quad x(0) = 1, \tag{12.79}$$

$$0 \le u \le U, \quad 0 \le v \le 1.$$

In writing (12.79) we have assumed that a fraction $\theta$ of a machine with
quality $x$ has a quality $\theta x$. Furthermore, we note that the solution of the
state equation will always satisfy $0 \le x \le 1$, because of the assumption
that $u \le U < b/g$.



12.3.5 Application of the Impulse Maximum Principle 

To apply the maximum principle, we define the Hamiltonian functions:

$$H(x, u, \lambda) = \pi x - u + \lambda(-bx + gu) = (\pi - \lambda b)x + (-1 + \lambda g)u \tag{12.80}$$

and

$$H^I(x, v) = [Kx - C + \lambda(t^+)(1 - x)]v, \tag{12.81}$$

where $\lambda$ is defined below and $t_1$ is the time at which an impulse control is
applied. We can now state the necessary conditions (12.60) of optimality
for the machine maintenance and replacement model:

$$\dot{x} = -bx + gu, \tag{12.82}$$

$$x(t_1^+) = x(t_1) + v(t_1)[1 - x(t_1)], \tag{12.83}$$

$$\dot{\lambda} = -\pi + \lambda b, \quad \lambda(T) = 0, \quad t \ne t_1, \tag{12.84}$$

$$\lambda(t_1) = \lambda(t_1^+) + [K - \lambda(t_1^+)]v(t_1), \tag{12.85}$$

$$(\pi - \lambda b)x + (-1 + \lambda g)u^* \ge (\pi - \lambda b)x + (-1 + \lambda g)u, \quad u \in [0, U], \tag{12.86}$$

$$\{Kx(t_1) - C + \lambda(t_1^+)[1 - x(t_1)]\}v^*(t_1) \ge \{Kx(t_1) - C + \lambda(t_1^+)[1 - x(t_1)]\}v, \quad v \in [0, 1], \tag{12.87}$$

and

$$\pi x(t_1^+) - \lambda(t_1^+)bx(t_1^+) + [-1 + \lambda(t_1^+)g]u^*(t_1^+) = \pi x(t_1) - \lambda(t_1)bx(t_1) + [-1 + \lambda(t_1)g]u^*(t_1). \tag{12.88}$$



The solution of (12.84) for $t_1 < t \le T$ is

$$\lambda(t) = \frac{\pi}{b}[1 - e^{-b(T - t)}]. \tag{12.89}$$

From (12.86) the maintenance control is bang-bang:

$$u^* = \text{bang}[0, U; -1 + \lambda g]. \tag{12.90}$$



The switching point $t_2^s$ is given by solving $-1 + \lambda g = 0$. Thus,

$$t_2^s = T + \frac{1}{b}\ln\left[1 - \frac{b}{\pi g}\right], \tag{12.91}$$

provided the right-hand side is in the interval $(t_1, T)$; otherwise set $t_2^s =
t_1$. We can graph the optimal maintenance control in the interval $(t_1, T]$
as in Figure 12.7. Note that this is the optimal maintenance on the new
machine. To find the optimal maintenance on the old machine, we need
to obtain the value of $\lambda(t)$ in the interval $[0, t_1]$. This is done later in this
section, where we obtain a formula for $t_1$ and the optimal maintenance
policy of the old machine as shown in Figure 12.7 in the interval $[0, t_1]$.
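
The switching time for maintenance on the new machine follows from setting $-1 + \lambda(t)g = 0$ with $\lambda(t) = (\pi/b)[1 - e^{-b(T - t)}]$, which gives $t = T + (1/b)\ln[1 - b/(\pi g)]$. The Python sketch below evaluates this and the resulting bang-bang rule. The function names and the sample parameters are illustrative assumptions; the sketch also ignores the proviso that the switching time must lie to the right of $t_1$.

```python
import math


def adjoint_new_machine(t, T, pi_rate, b):
    """Adjoint lambda(t) on the new machine, valid for t1 < t <= T."""
    return (pi_rate / b) * (1.0 - math.exp(-b * (T - t)))


def switching_time_new_machine(T, pi_rate, b, g):
    """Time at which -1 + lambda(t) * g changes sign; needs pi_rate * g > b."""
    return T + (1.0 / b) * math.log(1.0 - b / (pi_rate * g))


def maintenance_new_machine(t, T, pi_rate, b, g, U):
    """Bang-bang maintenance: U while lambda(t) * g exceeds 1, otherwise 0."""
    if -1.0 + adjoint_new_machine(t, T, pi_rate, b) * g > 0.0:
        return U
    return 0.0
```

With the assumed values T = 10, pi_rate = 2, b = 0.5, g = 1, and U = 0.4 (so that U < b/g), maintenance runs at the full rate until shortly before the horizon and is then switched off, since maintaining a machine that has no salvage value stops paying near the end.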




Figure 12.7: Optimal Maintenance Policy

From (12.87), the optimal replacement control is also bang-bang:

$$v^*(t_1) = \text{bang}[0, 1; Kx(t_1) - C + \lambda(t_1^+)\{1 - x(t_1)\}]. \tag{12.92}$$

As in Section 12.3.3, we set the argument of (12.92) to zero to obtain
$\phi(t)$:

$$x(t_1) = \frac{C - \frac{\pi}{b}[1 - e^{-b(T - t_1)}]}{K - \frac{\pi}{b}[1 - e^{-b(T - t_1)}]} := \phi(t_1). \tag{12.93}$$

To graph $\phi(t)$, we compute



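$$\phi(0) = \frac{C - \frac{\pi}{b}[1 - e^{-bT}]}{K - \frac{\pi}{b}[1 - e^{-bT}]} \tag{12.94}$$

and compute the time

$$\hat{t} = T + \frac{1}{b}\ln\left(1 - \frac{bC}{\pi}\right), \tag{12.95}$$

which makes $\phi(\hat{t}) = 0$.

For simplicity of analysis, we assume

$$gK < 1 < gC < g\pi/b \quad \text{and} \quad \hat{t} > 0. \tag{12.96}$$

In Exercise 12.10, you are asked to show that

$$0 < \hat{t} < t_2^s < T. \tag{12.97}$$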



We can now graph $\phi(t)$ as shown in Figure 12.8. In plotting Figure
12.8, we have assumed that $0 < \phi(0) < 1$. This is certainly the case if
$T$ is large enough that $C < (\pi/b)(1 - e^{-bT})$.

As in the oil driller’s problem, we obtain $\psi(t)$ by using (12.88). From
(12.92), we have $v^*(t_1) = 1$ and, therefore, $\lambda(t_1) = K$ from (12.85) and
$x(t_1^+) = 1$ from (12.83). Since $gK < 1$ from (12.96), we have

$$-1 + g\lambda(t_1) = -1 + gK < 0$$

and, thus, $u^*(t_1) = 0$ from (12.90). That is, zero maintenance is optimal
on the old machine just before it is replaced. Since $t_1 < \hat{t} < t_2^s$ from
(12.97), we have $u^*(t_1^+) = U$ from Figure 12.7. That is, full maintenance
is optimal on the new machine at the beginning.

Substituting all these values in (12.88), we have

$$\pi - \frac{\pi}{b}[1 - e^{-b(T - t_1)}]\,b + \left\{-1 + \frac{\pi}{b}[1 - e^{-b(T - t_1)}]\,g\right\}U = \pi x(t_1) - Kbx(t_1), \tag{12.98}$$

which yields

$$x(t_1) = \frac{U(g\pi - b) + \pi(b - gU)e^{-b(T - t_1)}}{b(\pi - Kb)} := \psi(t_1). \tag{12.99}$$
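
The algebra leading from the continuity condition (12.88) to the replacement curve can be verified numerically. The sketch below assumes the values used at the replacement time, namely $x(t_1^+) = 1$, $\lambda(t_1) = K$, $u^*(t_1) = 0$, and $u^*(t_1^+) = U$, and checks that $x(t_1) = [U(g\pi - b) + \pi(b - gU)e^{-b(T - t_1)}]/[b(\pi - Kb)]$ balances the two sides. The function names and the sample parameters are illustrative assumptions.

```python
import math


def psi(t1, T, pi_rate, b, g, U, K):
    """Replacement curve: the value of x(t1) that satisfies (12.88)."""
    decay = math.exp(-b * (T - t1))
    numerator = U * (g * pi_rate - b) + pi_rate * (b - g * U) * decay
    return numerator / (b * (pi_rate - K * b))


def continuity_gap(x_t1, t1, T, pi_rate, b, g, U, K):
    """Left side minus right side of (12.88) for a candidate x(t1)."""
    lam_plus = (pi_rate / b) * (1.0 - math.exp(-b * (T - t1)))  # adjoint after t1
    left = pi_rate * 1.0 - lam_plus * b * 1.0 + (-1.0 + lam_plus * g) * U
    right = pi_rate * x_t1 - K * b * x_t1                        # u*(t1) = 0 term drops
    return left - right
```

For example, with the assumed values T = 10, pi_rate = 2, b = 0.5, g = 1, U = 0.4, and K = 0.5, the gap evaluated at psi(t1, ...) is zero up to rounding for any choice of t1.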



The graph of $\psi(t)$ is shown in Figure 12.8. In this figure, $AB$ represents
the replacement curve. The optimal trajectory x* (t) is shown by CDEFG 
under the assumption that > 0 and ti < t2, where is the intersection 
point of curves $\phi(t)$ and $\psi(t)$, as shown in Figure 12.8.

Figure 12.9 has been drawn for a choice of the problem parameters 
such that t\ =t 2 . In this case we have the option of either replacing the 
machine at ti = ^2? or not replacing it as sketched in the figure. In other 
words, this case is one of indifference between replacing and not replacing 
the machine. If is larger than t 2 , clearly no machine replacement is 
needed.

Figure 12.9: The Case $t_1 = t_2$



To complete the solution of the problem, it is easy to compute $\lambda(t)$
in the interval $[0, t_1]$, and then use (12.90) to obtain the optimal mainte-
nance policy for the old machine before replacement. Using $\lambda(t_1) = K$
obtained above and the adjoint equation (12.84), we have

$$\lambda(t) = \frac{\pi}{b}\left[1 - e^{-b(t_1 - t)}\right] + Ke^{-b(t_1 - t)}, \quad t \in [0, t_1]. \tag{12.100}$$



Using (12.100) in (12.90), we can get $u^*(t)$, $t \in [0, t_1]$. More specifically,
we can get the switching point $t_1^s$ by solving $-1 + \lambda g = 0$. Thus,

$$t_1^s = t_1 + \frac{1}{b}\ln\left[\frac{\pi g - b}{\pi g - bKg}\right]. \tag{12.101}$$

If $t_1^s \le 0$, then the policy of no maintenance is optimal in the interval
$[0, t_1]$. If $t_1^s > 0$, the optimal maintenance policy for the old machine is

$$u^*(t) = \begin{cases} U, & t \in [0, t_1^s], \\ 0, & t \in (t_1^s, t_1]. \end{cases} \tag{12.102}$$

In plotting Figures 12.7 and 12.8, we have assumed that $t_1^s > 0$.

With (12.102), one can obtain an expression for $x(t_1)$ in
terms of $t_1$. Equating this expression to $\psi(t_1)$ obtained in (12.99), one
can write a transcendental equation for $t_1$. In Exercise 12.11, you are
asked to complete the solution of the problem by obtaining the equation
for $t_1$.



EXERCISES FOR CHAPTER 12 



12.1 A bilinear quadratic advertising model (Deal, Sethi, and Thomp-
son, 1979): Let $x_i$ be the market share of firm $i$ and $u_i$ be its
advertising rate, $i = 1, 2$. The state equations are

$$\dot{x}_1 = b_1u_1(1 - x_1 - x_2) + e_1(u_1 - u_2)(x_1 + x_2) - a_1x_1, \quad x_1(0) = x_{10},$$

$$\dot{x}_2 = b_2u_2(1 - x_1 - x_2) + e_2(u_2 - u_1)(x_1 + x_2) - a_2x_2, \quad x_2(0) = x_{20},$$

where $b_i$, $e_i$, and $a_i$ are given positive constants. Firm $i$ wants to
maximize

$$J_i = w_ie^{-\rho T}x_i(T) + \int_0^T (c_ix_i - u_i^2)e^{-\rho t}\,dt,$$

where $w_i$, $c_i$, and $\rho$ are positive constants. Derive the necessary
conditions for the open-loop Nash solution, and formulate the re-
sulting boundary value problem. In a related paper, Deal (1979)
provides a numerical solution to this problem with $e_1 = e_2 = 0$.

12.2 Develop the nonlinear model described in the last paragraph of 
Section 12.1.3 by rewriting (12.19) and (12.22) for the model. De- 
rive the adjoint equation for A* for the ith producer, and show that 
the closed- loop Nash policy for producer i is given by 



f{v*) 



d 

(p* — A*)a: 



12.3 Verify that (12.48) is a solution of (12.46). 

12.4 Obtain or verify that (12.50) is the solution of (12.43) with initial 
condition (12.32) and (12.33). 

12.5 Assume $\phi(0) > 0$ as in (12.75). Show that $Q < P$ and $0 < \hat{t} < T$.

12.6 If $\phi(0) \le 0$, show that no drilling is optimal.

12.7 Show that the point ti is equal to T/2 in Figure 12.5. 

12.8 Find the optimal drilling time for the oil driller’s problem by using 
ordinary calculus under the assumption that there is exactly one 
drilling time when an oil well of unit initial stock is drilled. 

12.9 Derive (12.78) and show that for $T > \hat{T}$, the drilling function is
strictly positive, i.e.,

$$-Q + \lambda(t_1^+)[1 - x(t_1)] > 0.$$

12.10 Show that the inequalities in (12.97) follow from the assumptions 
in (12.96). 

12.11 Complete the solution of the machine maintenance and replace-
ment problem analyzed in Section 12.3.5. Specifically, obtain the
equation satisfied by the replacement time $t_1$ and an expression for
$x^*(t)$, $t \in [0, T]$, in terms of $t_1$.




Chapter 13 

Stochastic Optimal Control 



In previous chapters we assumed that the state variables of the system 
were known with certainty. If this were not the case, the state of the 
system over time would be a stochastic process. In addition, it might 
not be possible to measure the value of the state variables at time t. In 
this case, one would have to measure functions of the state variables. 
Moreover, the measurements are usually noisy, i.e., they are subject to 
errors. Thus, a decision maker is faced with the problem of making good 
estimates of these state variables from noisy measurements on functions 
of them. 

The process of estimating the values of the state variables is called op- 
timal filtering. In Section 13.1, we will discuss one particular filter, called 
the Kalman filter and its continuous-time analogue called the Kalman- 
Bucy filter. It should be noted that while optimal filtering provides 
optimal estimates of the value of the state variables from noisy measure- 
ments of related quantities, no control is involved. 

When a control is involved, we are faced with a stochastic optimal 
control problem. Here, the state of the system is represented by a con- 
trolled stochastic process. In Section 13.2, we shall formulate a stochas- 
tic optimal control problem which is governed by stochastic differential 
equations. We shall only consider stochastic differential equations of a
type known as Ito equations. These equations arise when the state equa- 
tions, such as those we have seen in the previous chapters, are perturbed 
by Markov diffusion processes. Our goal in Section 13.2 will be to syn- 
thesize optimal feedback controls for systems subject to Ito equations in 
a way that maximizes the expected value of a given objective function. 

In Section 13.3, we shall extend the production planning model of
Chapter 6 to allow for some uncertain disturbances. We shall obtain an
optimal production policy for the stochastic production planning prob- 
lem thus formulated. 

In Section 13.4, we solve an optimal stochastic advertising problem 
explicitly. The problem is a modification as well as a stochastic extension 
of the optimal control problem of the Vidale- Wolfe advertising model 
treated in Section 7.2.4. 

In Section 13.5, we will introduce investment decisions in the con-
sumption model of Example 1.3. We will consider both risk-free and
risky investments. Our goal will be to find optimal consumption and in-
vestment policies in order to maximize the discounted value of the utility
of consumption over time.

In Section 13.6, we shall conclude the chapter by mentioning other 
types of stochastic optimal control problems that arise in practice. In 
particular, production planning problems where production is done by
machines that are unreliable or failure-prone, can be formulated as 
stochastic optimal control problems involving jump Markov processes. 
Such problems are treated in Sethi and Zhang (1994a, 1994c). Karatzas 
and Shreve (1998) address stochastic optimal control problems in finance 
involving more general stochastic processes including jump processes. 



13.1 The Kalman Filter 

So far in this book, we have assumed that the values of the state variables 
can be measured with certainty. In many cases the assumption that the 
value of a state variable can be directly measured and exactly determined 
may not be realistic. 

There are two types of random disturbances present. The first kind,
termed measurement noise, arises because of imprecise measurement in- 
struments, inaccurate recording systems, etc. In many cases the mea- 
surement technique involves observations of functions of state variables, 
from which the values of some or all of the state variables are inferred; 
e.g., measuring the inventory of natural gas reservoir involves pressure 
measurements together with physical laws relating pressure and volume. 

The second kind can be termed system noise, in which the system 
itself is subjected to random disturbances. For instance, sales may follow 
a stochastic process, which affects the system equation (6.1) relating in-
ventory, production, and sales. In the cash balance example, the demand
for cash as well as the interest rates in (5.1) and (5.2) can be represented
by stochastic processes.

In analyzing systems, in which one or both of these kinds of noises are 
present, it is important to be able to make good estimates of the values 
of the state variables. In particular, many optimal control problems 
require such estimates in order to determine the optimal controls; see 
Appendix D.1.1. The process of estimating the current values of the state 
variables given the past measurements is called filtering; see Kalman 
(1960a, 1960b), Sorenson (1966), and Bryson and Ho (1969). 

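Consider a dynamic stochastic system in discrete time described by
the difference equation

$$x^{t+1} - x^t = A_tx^t + G_tw^t \tag{13.1}$$

or

$$x^{t+1} = x^t + A_tx^t + G_tw^t = (A_t + I)x^t + G_tw^t, \tag{13.2}$$

where $x^t$ is the $n$-component state vector, $w^t$ is an $m$-component system
noise vector, $A_t$ is an $n \times n$ matrix, and $G_t$ is an $n \times m$ matrix. The
initial state $x^0$ is assumed to be a Gaussian (normal) random variable
with mean and variance given by

$$E[x^0] = \bar{x}^0 \quad \text{and} \quad E[(x^0 - \bar{x}^0)(x^0 - \bar{x}^0)^T] = M_0. \tag{13.3}$$

Also $w^t$ is a Gaussian purely random sequence (Joseph and Tou, 1961)
with

$$E[w^t] = \bar{w}^t \quad \text{and} \quad E[(w^t - \bar{w}^t)(w^\tau - \bar{w}^\tau)^T] = Q_t\delta_{t\tau}, \tag{13.4}$$

where

$$\delta_{t\tau} = \begin{cases} 0 & \text{if } t \ne \tau, \\ 1 & \text{if } t = \tau. \end{cases} \tag{13.5}$$

Thus, $Q_t$ represents the covariance matrix of the random vector $w^t$, and
$w^t$ and $w^\tau$ are independent random variables for $t \ne \tau$. We also assume
that the sequence $w^t$ is independent of the initial condition $x^0$, i.e.,

$$E[(w^t - \bar{w}^t)(x^0 - \bar{x}^0)^T] = 0. \tag{13.6}$$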



The process of measurement of the state variables $x^t$ yields a
$k$-dimensional vector $y^t$, which is related to $x^t$ by the transformation

$$y^t = H_t x^t + v^t, \qquad (13.7)$$

where $H_t$ is the state-to-measurement transformation matrix of dimension
$k \times n$, and $v^t$ is a Gaussian purely random sequence of measurement
noise vectors having the following properties:

$$E[v^t] = 0, \quad E[v^t (v^\tau)^T] = R_t\delta_{t\tau}, \qquad (13.8)$$

$$E[(w^t - \bar{w}^t)(v^\tau)^T] = 0, \quad E[(x^0 - \bar{x}^0)(v^t)^T] = 0. \qquad (13.9)$$

In (13.8) the matrix $R_t$ is the covariance matrix for the random variable
$v^t$. The requirements in (13.9) mean that the additive measurement
noise is independent of the system noise as well as the initial state.

Given a sequence of observations $y^0, y^1, \ldots, y^t$ up to time $t$, we
would like to obtain the maximum likelihood estimate of the state $x^t$, or
equivalently, to find the minimum weighted least squares estimate. In order
to derive the estimate $\hat{x}^t$ of $x^t$, we require the use of the Bayes
theorem and an application of calculus to find the unconstrained minimum of
the weighted least squares function. This derivation is straightforward
but lengthy. It yields the following recursive procedure for finding the
estimate $\hat{x}^t$:

$$\hat{x}^t = \bar{x}^t + K_t(y^t - H_t\bar{x}^t) \qquad (13.10)$$

with

$$K_t = P_t H_t^T R_t^{-1}. \qquad (13.11)$$

The mean $\bar{x}^t$ and the matrix $P_t$ are calculated recursively by means
of

$$\bar{x}^t = (A_{t-1} + I)\hat{x}^{t-1} + G_{t-1}\bar{w}^{t-1}, \quad \bar{x}^0 \text{ given}, \qquad (13.12)$$

$$P_t = (M_t^{-1} + H_t^T R_t^{-1} H_t)^{-1}, \qquad (13.13)$$

where

$$M_t = (A_{t-1} + I)P_{t-1}(A_{t-1} + I)^T + G_{t-1}Q_{t-1}G_{t-1}^T, \quad M_0 \text{ given}. \qquad (13.14)$$

The procedure in expressions (13.10)-(13.14) is known as the Kalman
filter for linear discrete-time processes.

The interpretation of (13.10) is that the estimate $\hat{x}^t$ is equal to the
mean value $\bar{x}^t$ plus a correction term which is proportional to the
difference between the actual measurement $y^t$ and the predicted measurement
$H_t\bar{x}^t$. It should be noted that

$$P_t = E[(x^t - \hat{x}^t)(x^t - \hat{x}^t)^T],$$

i.e., $P_t$ is the variance of $(x^t - \hat{x}^t)$. It therefore is a measure
of the risk of $(x^t - \hat{x}^t)$. Thus, the proportionality matrix $K_t$ can
be interpreted as the ratio between the transformed risk ($P_tH_t^T$) and the
risk $R_t$ in the measurement. Because of this property of $K_t$, it is called
the Kalman gain in the engineering literature.
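
The recursion (13.10)-(13.14) can be sketched for a scalar system ($n = m = k = 1$), where the matrix inverses reduce to divisions. This is only a minimal illustration: the names `kalman_scalar`, `simulate`, `a`, `g`, `h`, `q`, `r`, `xbar0`, `m0`, and `wbar` are hypothetical choices for this sketch rather than notation from the text, and the simulated system noise is assumed to have zero mean.

```python
import random


def kalman_scalar(ys, a, g, h, q, r, xbar0, m0, wbar=0.0):
    """Scalar form of (13.10)-(13.14); returns the estimates and their variances."""
    estimates, variances = [], []
    xbar, m = xbar0, m0                       # prior mean and prior variance M_0
    for y in ys:
        p = 1.0 / (1.0 / m + h * h / r)       # (13.13)
        k = p * h / r                         # (13.11), the Kalman gain
        xhat = xbar + k * (y - h * xbar)      # (13.10)
        estimates.append(xhat)
        variances.append(p)
        xbar = (a + 1.0) * xhat + g * wbar    # (13.12), prior mean for next step
        m = (a + 1.0) ** 2 * p + g * g * q    # (13.14), prior variance for next step
    return estimates, variances


def simulate(n, a, g, h, q, r, xbar0, m0, seed=0):
    """Draw a state path from (13.2) and measurements from (13.7), zero-mean noise."""
    rng = random.Random(seed)
    x = rng.gauss(xbar0, m0 ** 0.5)
    xs, ys = [], []
    for _ in range(n):
        xs.append(x)
        ys.append(h * x + rng.gauss(0.0, r ** 0.5))
        x = (a + 1.0) * x + g * rng.gauss(0.0, q ** 0.5)
    return xs, ys
```

In this scalar case the variance `p` never exceeds `r / h**2`, so the filter is never less certain about the state than a single measurement would make it.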

To obtain an analogous filter for the continuous-time case, we must
define a purely random process to take the place of the Gaussian purely
random sequence. Such a process will be called a white noise process,
which we denote by $w(t)$. This process must be Gaussian at each time $t$,
and $w(t)$ and $w(\tau)$ must be independent for $t \neq \tau$. We will now
show how to obtain such a process.

Consider a Gaussian random process $w(t)$ with $E[w(t)] = \bar{w}(t)$ and
the autocorrelation matrix

$$E[(w(t) - \bar{w}(t))(w(\tau) - \bar{w}(\tau))^T] = q(t)e^{-|t-\tau|/T}, \qquad (13.15)$$

where the parameter $T$ is small and $q(t)$ is the covariance matrix

$$q(t) = E[(w(t) - \bar{w}(t))(w(t) - \bar{w}(t))^T]. \qquad (13.16)$$

Since $T$ is small, it is obvious from (13.15) that the correlation between
$w(t)$ and $w(\tau)$ decreases rapidly as $|t - \tau|$ increases. The
autocorrelation function is sketched in Figure 13.1 for a scalar process.




Figure 13.1: Autocorrelation Function for a Scalar Process
In order to obtain the white noise process, we let $T$ approach 0, and
define

$$Q(t) = \lim_{T \to 0} 2Tq(t). \qquad (13.17)$$

Using the function $Q(t)$ we can approximate (13.15) by the expression

$$E[(w(t) - \bar{w}(t))(w(\tau) - \bar{w}(\tau))^T] \approx Q(t)\delta(t - \tau), \qquad (13.18)$$

where $\delta(t - \tau)$, which is called the Dirac delta function, is the
limit as $\varepsilon \to 0$ of

$$\delta_\varepsilon(t - \tau) = \begin{cases} 0 & \text{when } |t - \tau| > \varepsilon, \\ \dfrac{1}{2\varepsilon} & \text{when } |t - \tau| \le \varepsilon. \end{cases} \qquad (13.19)$$
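
The factor $2T$ in (13.17) can be motivated by comparing areas. The following short check assumes that the autocorrelation in (13.15) has the exponential form $q(t)e^{-|t-\tau|/T}$ and treats $q$ as constant over the short correlation time; it is a heuristic sketch, not a rigorous limit argument.

```latex
\int_{-\infty}^{\infty} q\,e^{-|s|/T}\,ds
  = 2q\int_{0}^{\infty} e^{-s/T}\,ds = 2Tq,
\qquad
\int_{-\infty}^{\infty} \delta_\varepsilon(s)\,ds
  = 2\varepsilon\cdot\frac{1}{2\varepsilon} = 1 .
```

Thus $Q(t)\delta(t-\tau)$ carries the same total area $Q = 2Tq$ as the exponential autocorrelation, concentrated at $t = \tau$.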

It is instructive to compare (13.18) with the corresponding discrete-time
expression (13.4). The above discussion permits us to write formally the
continuous analogue of (13.1) as

$$dx(t) = A(t)x(t)\,dt + G(t)w(t)\,dt. \qquad (13.20)$$

In order to interpret and solve this equation, a knowledge of stochastic
differential equations is required; see, e.g., Gihman and Skorohod
(1971), Arnold (1974), Karatzas and Shreve (1997), or Øksendal (1998).
Equation (13.20) can be restated as the linear Itô stochastic differential
equation

$$dx(t) = A(t)x(t)\,dt + G(t)\,dz(t), \qquad (13.21)$$

where $z(t)$ denotes a Wiener process; see Wiener (1949). Comparison of
(13.20) and (13.21) suggests that $w(t)$ could be considered a “generalized
derivative” of the Wiener process. A solution $x(t)$ of an Itô equation is
a stochastic process, known as a diffusion process.

We also define the measurement process $y(t)$ in continuous time as

$$y(t) = H(t)x(t) + v(t), \qquad (13.22)$$

where $v(t)$ is a white noise process with

$$E[v(t)] = 0 \quad \text{and} \quad E[v(t)v(\tau)^T] = R(t)\delta(t - \tau). \qquad (13.23)$$

Using the theory of stochastic differential equations and calculus of
maxima and minima, we can obtain a filter which provides the estimate
$\hat{x}(t)$ given the observations $y(s)$, $s \in [0, t]$, as follows:

$$\dot{\hat{x}} = A\hat{x} + G\bar{w} + K[y - H\hat{x}], \quad \hat{x}(0) = E[x(0)], \qquad (13.24)$$

$$K = PH^TR^{-1}, \qquad (13.25)$$

$$\dot{P} = AP + PA^T - KHP + GQG^T, \qquad (13.26)$$

$$P(0) = E[x(0)x^T(0)]. \qquad (13.27)$$



This is called the Kalman-Bucy filter (Kalman and Bucy, 1961) for linear
systems in continuous time. It is analogous to the Kalman filter for the
discrete-time case. Equation (13.26) is called the matrix Riccati equation;
see Appendix D. Besides engineering applications, the Kalman filter
and its extensions are very useful in econometric modeling; see Buchanan
and Norton (1971), Chow (1975), Aoki (1976), and Naik, Mantrala, and
Sawyer (1998).
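
To see how the Riccati equation (13.26) behaves, the sketch below integrates its scalar form with a simple Euler scheme and compares the result with the nonnegative stationary root. It assumes the gain has the form $K = PH^TR^{-1}$, as in (13.11); the function names and the scalar parameters `a`, `g`, `h`, `q`, `r` are hypothetical labels for $A$, $G$, $H$, $Q$, $R$, and the step size is an arbitrary choice.

```python
import math


def riccati_scalar(a, g, h, q, r, p0, t_end, n_steps):
    """Euler integration of the scalar Riccati equation dP/dt = 2aP - KhP + g^2 q."""
    dt = t_end / n_steps
    p = p0
    path = [p]
    for _ in range(n_steps):
        k = p * h / r                                     # scalar gain K = P h / r
        p += (2.0 * a * p - k * h * p + g * g * q) * dt   # scalar form of (13.26)
        path.append(p)
    return path


def riccati_steady_state(a, g, h, q, r):
    """Nonnegative root of 0 = 2aP - (h^2 / r) P^2 + g^2 q."""
    c = h * h / r
    return (a + math.sqrt(a * a + c * g * g * q)) / c
```

For example, with `a = 0`, `g = 1`, `h = 2`, `q = 1`, and `r = 4`, the error variance settles at $\sqrt{rq}/2 = 1$ regardless of the initial variance `p0`.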

13.2 Stochastic Optimal Control 

In Section 13.1 on the Kalman filter, we obtained optimal state
estimation for linear systems with noise and noisy measurements. We also
defined the white noise process for obtaining the Kalman-Bucy filter for
continuous-time systems. In Appendix D.1.1, we note that for stochastic
linear-quadratic optimal control problems, the separation principle
allows us to solve the problem in two steps: to obtain the optimal
estimate of the state and to use it in the optimal feedback control formula
for deterministic linear-quadratic problems.

In this section we shall introduce the possibility of controlling a
system governed by Itô stochastic differential equations. In other words,
we shall introduce control variables into a nonlinear version of equation
(13.21). This produces the formulation of a stochastic optimal control
problem.

It should be noted that for such problems, the separation principle
does not hold in general. Therefore, to simplify the treatment, it
is often assumed that the state variables are observable, in the sense
that they can be directly measured. Furthermore, most of the literature
on these problems uses dynamic programming or the Hamilton-Jacobi-Bellman
framework rather than stochastic maximum principles.
In what follows, therefore, we shall formulate the stochastic optimal
control problem under consideration, and provide a brief, informal
development of the Hamilton-Jacobi-Bellman (HJB) equation for the problem.
A detailed analysis of the problem is available in Fleming and Rishel
(1975). For problems involving jump disturbances, see Davis (1993) for
the methodology of optimal control of piecewise deterministic processes.
For stochastic optimal control in discrete time, see Bertsekas and Shreve
(1996).

Let us consider the problem of maximizing

$$E\left[\int_0^T F(X_t, U_t, t)\,dt + S(X_T, T)\right], \qquad (13.28)$$

where $X_t$ is the state variable, $U_t$ is the closed-loop control variable,
$z_t$ is a standard Wiener process, and together they are required to satisfy
the Itô stochastic differential equation

$$dX_t = f(X_t, U_t, t)\,dt + G(X_t, U_t, t)\,dz_t, \quad X_0 = x_0. \qquad (13.29)$$

For convenience in exposition we assume $F: E^1 \times E^1 \times E^1 \to E^1$,
$S: E^1 \times E^1 \to E^1$, $f: E^1 \times E^1 \times E^1 \to E^1$, and
$G: E^1 \times E^1 \times E^1 \to E^1$, so that (13.29) is a scalar equation.
We also assume that the functions $F$ and $S$ are continuous in their arguments
and the functions $f$ and $G$ are continuously differentiable in their
arguments. For multidimensional extensions of this problem, see Fleming and
Rishel (1975).

Since (13.29) is a scalar equation, the subscript $t$ here means only
time $t$. Thus, writing $X_t$ in place of $X(t)$ will not cause any
confusion and, at the same time, will eliminate the need for writing many
parentheses. Thus, $dz_t$ in (13.29) is the same as $dz(t)$ in (13.21), except
that in (13.29), $dz_t$ is a scalar.

To solve the problem defined by (13.28) and (13.29), let $V(x, t)$,
known as the value function, be the expected value of the objective
function (13.28) from $t$ to $T$, when an optimal policy is followed from $t$
to $T$, given $X_t = x$. Then, by the principle of optimality,

$$V(x, t) = \max_u E[F(x, u, t)\,dt + V(x + dX_t, t + dt)]. \qquad (13.30)$$

By Taylor's expansion, we have

$$\begin{aligned} V(x + dX_t, t + dt) = {} & V(x, t) + V_t\,dt + V_x\,dX_t + \tfrac{1}{2}V_{xx}(dX_t)^2 \\ & + \tfrac{1}{2}V_{tt}(dt)^2 + V_{xt}\,dX_t\,dt \\ & + \text{higher-order terms}. \end{aligned} \qquad (13.31)$$
From (13.29), we can formally write

$$(dX_t)^2 = f^2(dt)^2 + G^2(dz_t)^2 + 2fG\,dz_t\,dt, \qquad (13.32)$$

$$dX_t\,dt = f(dt)^2 + G\,dz_t\,dt. \qquad (13.33)$$

The exact meaning of these expressions comes from the theory of
stochastic calculus; see Arnold (1974, Chapter 5), Durrett (1996), or
Karatzas and Shreve (1997). For our purposes, it is sufficient to know
the multiplication rules of the stochastic calculus:

$$(dz_t)^2 = dt, \quad dz_t \cdot dt = 0, \quad dt^2 = 0. \qquad (13.34)$$
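
The multiplication rules can be made plausible numerically. The sketch below (an illustration under the assumption that Wiener increments over a step of length $dt$ are independent $N(0, dt)$ draws; the function name and defaults are hypothetical) sums $(dz_t)^2$, $dz_t\,dt$, and $dt^2$ over a fine partition of $[0, T]$. The first sum stays close to $T$, while the other two shrink as the partition is refined.

```python
import random


def multiplication_rule_sums(t_end=1.0, n_steps=20000, seed=0):
    """Return the sums of dz^2, dz*dt and dt^2 over n_steps increments on [0, t_end]."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sd = dt ** 0.5                      # standard deviation of one Wiener increment
    sum_dz2 = 0.0
    sum_dzdt = 0.0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, sd)
        sum_dz2 += dz * dz              # behaves like the integral of dt
        sum_dzdt += dz * dt             # of order dt, vanishes in the limit
    sum_dt2 = n_steps * dt * dt         # equals t_end * dt, vanishes in the limit
    return sum_dz2, sum_dzdt, sum_dt2
```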

Substitute (13.31) into (13.30) and use (13.32), (13.33), and (13.34) to
obtain

$$V = \max_u E\left[F\,dt + V + V_t\,dt + V_x f\,dt + \tfrac{1}{2}V_{xx}G^2\,dt + o(dt)\right]. \qquad (13.35)$$

Note that we have suppressed the arguments of the functions involved
in (13.35), and that the term $V_xG\,dz_t$ drops out because $E[dz_t] = 0$.

Canceling the term $V$ on both sides of (13.35), dividing the remainder
by $dt$, and letting $dt \to 0$, we obtain the Hamilton-Jacobi-Bellman
equation

$$0 = \max_u\left[F + V_t + V_x f + \tfrac{1}{2}V_{xx}G^2\right] \qquad (13.36)$$

for the value function $V(x, t)$ with the boundary condition

$$V(x, T) = S(x, T). \qquad (13.37)$$

In the next section we shall apply this theory of stochastic optimal 
control to a simple stochastic production inventory problem treated by 
Sethi and Thompson (1981). 

13.3 A Stochastic Production Planning Model 

Consider a factory producing a homogeneous good and having an inventory
warehouse. Define the following quantities:

$X_t$ = the inventory level at time $t$ (state variable),
$U_t$ = the production rate at time $t$ (control variable),
$S$ = the constant demand rate at time $t$; $S > 0$,
$T$ = the length of planning period,
$\bar{x}$ = the factory-optimal inventory level,
$\bar{u}$ = the factory-optimal production level,
$x_0$ = the initial inventory level,
$h$ = the inventory holding cost coefficient,
$c$ = the production cost coefficient,
$B$ = the salvage value per unit of inventory at time $T$,
$z_t$ = the standard Wiener process,
$\sigma$ = the constant diffusion coefficient.

We now state the conditions of the model. The first condition is the
stock-flow equation stated as the Itô stochastic differential equation

$$dX_t = (U_t - S)\,dt + \sigma\,dz_t, \quad X_0 = x_0, \qquad (13.38)$$

where $x_0$ denotes the initial inventory level. As in (13.20) and (13.21),
we note that the process $dz_t$ can be formally expressed as $w(t)\,dt$, where
$w(t)$ is considered to be a white noise process; see Arnold (1974). It can
be interpreted as “sales returns,” “inventory spoilage,” etc., which are
random in nature. The second is the objective function:

$$\max E\left\{\int_0^T -\left[c(U_t - \bar{u})^2 + h(X_t - \bar{x})^2\right]dt + BX_T\right\}. \qquad (13.39)$$

Note that we do not restrict the production rate to be nonnegative as 
required in Chapter 6. In other words, we permit disposal (i.e., $U_t < 0$).
While this is done for mathematical expedience, we will state conditions 
under which a disposal is not required. Note further that the inventory 
level is allowed to be negative, i.e., we permit backlogging of demand. 

The solution of the above model will be carried out via the previous
development of the Hamilton-Jacobi-Bellman equation satisfied by a certain
value function. To simplify the mathematics, we assume that

$$\bar{x} = \bar{u} = 0 \quad \text{and} \quad h = c = 1. \qquad (13.40)$$

This assumption results in no loss of generality as the following analysis
can be extended in a parallel manner for the case without (13.40). With
(13.40), we restate (13.39) as

$$\max E\left\{\int_0^T -(U_t^2 + X_t^2)\,dt + BX_T\right\}. \qquad (13.41)$$

Let $V(x, t)$ denote the expected value of the objective function from time
$t$ to the horizon $T$ with $X_t = x$ and using the optimal policy from $t$ to
$T$. The function $V(x, t)$ is referred to as the value function, and it
satisfies the Hamilton-Jacobi-Bellman (HJB) equation

$$0 = \max_u\left[-(u^2 + x^2) + V_t + V_x(u - S) + \tfrac{1}{2}\sigma^2V_{xx}\right] \qquad (13.42)$$

with the boundary condition

$$V(x, T) = Bx. \qquad (13.43)$$

Note that these are applications of (13.36) and (13.37) to the production
planning problem.

It is now possible to maximize the expression inside the bracket of
(13.42) with respect to $u$ by taking its derivative with respect to $u$ and
setting it to zero. This procedure yields

$$u(x, t) = \frac{V_x(x, t)}{2}. \qquad (13.44)$$

Substituting (13.44) into (13.42) yields the equation

$$0 = \frac{V_x^2}{4} - x^2 + V_t - SV_x + \tfrac{1}{2}\sigma^2V_{xx}, \qquad (13.45)$$

known as the Hamilton-Jacobi equation. This is a partial differential
equation which must be satisfied by the value function $V(x, t)$ with the
boundary condition (13.43). The solution of (13.45) is considered in the
next section.
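
The substitution step can be written out explicitly. The following is a routine check, assuming the first-order condition for the bracket in the HJB equation is $-2u + V_x = 0$, so that the maximizing control is $u = V_x/2$ (the second derivative with respect to $u$ is $-2 < 0$, confirming a maximum).

```latex
\begin{aligned}
\left.\bigl(-u^2 + V_x u\bigr)\right|_{u = V_x/2}
  &= -\frac{V_x^2}{4} + \frac{V_x^2}{2} = \frac{V_x^2}{4}, \\
0 &= \frac{V_x^2}{4} - x^2 + V_t - S V_x + \tfrac{1}{2}\sigma^2 V_{xx}.
\end{aligned}
```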



Remark 13.1 It is important to remark that if the production rate were
restricted to be nonnegative, then (13.44) would be changed to

$$u(x, t) = \max\left[0, \frac{V_x(x, t)}{2}\right]. \qquad (13.46)$$

Substituting (13.46) into (13.42) would give us a partial differential
equation which must be solved numerically. We shall not consider (13.46)
further in this chapter.
13.3.1 Solution for the Production Planning Problem

To solve equation (13.45) we let

$$V(x, t) = Q(t)x^2 + R(t)x + M(t). \qquad (13.47)$$

Then,

$$V_t = \dot{Q}x^2 + \dot{R}x + \dot{M}, \qquad (13.48)$$

$$V_x = 2Qx + R, \qquad (13.49)$$

$$V_{xx} = 2Q, \qquad (13.50)$$

where $\dot{Y}$ denotes $dY/dt$. Substituting (13.48)-(13.50) in (13.45) and
collecting terms gives

$$x^2[\dot{Q} + Q^2 - 1] + x[\dot{R} + RQ - 2SQ] + \dot{M} + \frac{R^2}{4} - RS + \sigma^2Q = 0. \qquad (13.51)$$

Since (13.51) must hold for any value of $x$, we must have

$$\dot{Q} = 1 - Q^2, \quad Q(T) = 0, \qquad (13.52)$$

$$\dot{R} = 2SQ - RQ, \quad R(T) = B, \qquad (13.53)$$

$$\dot{M} = RS - \frac{R^2}{4} - \sigma^2Q, \quad M(T) = 0, \qquad (13.54)$$

where the boundary conditions for the system of simultaneous differential
equations (13.52), (13.53), and (13.54) are obtained by comparing (13.47)
with the boundary condition $V(x, T) = Bx$ of (13.43).

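To solve (13.52), we expand $\dot{Q}/(1 - Q^2)$ by partial fractions to obtain

$$\frac{\dot{Q}}{2}\left[\frac{1}{1 - Q} + \frac{1}{1 + Q}\right] = 1,$$

which can be easily integrated. The answer is

$$Q = \frac{y - 1}{y + 1}, \qquad (13.55)$$

where

$$y = e^{2(t - T)}. \qquad (13.56)$$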



Since $S$ is assumed to be a constant, we can reduce (13.53) to

$$\dot{R}^0 + R^0Q = 0, \quad R^0(T) = B - 2S$$

by the change of variable defined by $R^0 = R - 2S$. Clearly the solution
is given by

$$\log R^0(T) - \log R^0(t) = -\int_t^T Q(\tau)\,d\tau,$$

which can be simplified further to obtain

$$R = 2S + \frac{2(B - 2S)\sqrt{y}}{y + 1}. \qquad (13.57)$$



Having obtained solutions for $R$ and $Q$, we can easily express (13.54) as

$$M(t) = -\int_t^T \left[R(\tau)S - \frac{(R(\tau))^2}{4} - \sigma^2Q(\tau)\right]d\tau. \qquad (13.58)$$

The optimal control is defined by (13.44), and the use of (13.55) and
(13.57) yields

$$u^* = \frac{V_x}{2} = Qx + \frac{R}{2} = S + \frac{(y - 1)x + (B - 2S)\sqrt{y}}{y + 1}. \qquad (13.59)$$
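
The closed-form coefficients can be checked numerically. The sketch below assumes $y = e^{2(t-T)}$, which is what integrating (13.52) with $Q(T) = 0$ gives, and verifies by central differences that the resulting $Q$ and $R$ satisfy (13.52) and (13.53). The function names (`y_of`, `Q_of`, `R_of`, `u_star`, `ode_residuals`) are hypothetical labels chosen for this illustration.

```python
import math


def y_of(t, T):
    return math.exp(2.0 * (t - T))                       # (13.56), assumed form


def Q_of(t, T):
    yt = y_of(t, T)
    return (yt - 1.0) / (yt + 1.0)                       # (13.55)


def R_of(t, T, S, B):
    yt = y_of(t, T)
    return 2.0 * S + 2.0 * (B - 2.0 * S) * math.sqrt(yt) / (yt + 1.0)   # (13.57)


def u_star(x, t, T, S, B):
    return Q_of(t, T) * x + R_of(t, T, S, B) / 2.0       # (13.59)


def ode_residuals(t, T, S, B, eps=1e-6):
    """Central-difference residuals of (13.52) and (13.53) at time t."""
    dQ = (Q_of(t + eps, T) - Q_of(t - eps, T)) / (2.0 * eps)
    dR = (R_of(t + eps, T, S, B) - R_of(t - eps, T, S, B)) / (2.0 * eps)
    q, r = Q_of(t, T), R_of(t, T, S, B)
    return dQ - (1.0 - q * q), dR - (2.0 * S * q - r * q)
```

At the horizon, `Q_of(T, T)` is 0 and `R_of(T, T, S, B)` equals `B`, so the production rate at $t = T$ is $B/2$ regardless of the inventory level.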



Remark 13.2 The optimal production rate in (13.59) equals the demand
rate plus a correction term which depends on the level of inventory
and the distance from the horizon time $T$. Since $(y - 1) < 0$ for
$t < T$, it is clear that for lower values of $x$, the optimal production rate
is likely to be positive. However, if $x$ is very high, the correction term
will become smaller than $-S$, and the optimal control will be negative.
In other words, if the inventory level is too high, the factory can save money
by disposing of a part of the inventory, resulting in lower holding costs.



Remark 13.3 If the demand rate $S$ were time-dependent, it would have
changed the solution of (13.53). Having computed this new solution
in place of (13.57), we can once again obtain the optimal control as
$u^* = Qx + R/2$.

Remark 13.4 Note that when $T \to \infty$, we have $y \to 0$ and

$$u^* \to S - x, \qquad (13.60)$$

but the undiscounted objective function value (13.41) in this case
becomes $-\infty$. Clearly, any other policy will render the objective function
value to be $-\infty$. In a sense, the optimal control problem becomes
ill-posed. One way to get out of this difficulty is to impose a nonzero
discount rate. This is carried out in Sethi and Thompson (1980).
Remark 13.5 It would help our intuition if we could draw a picture
of the path of the inventory level over time. Since the inventory level
is a stochastic process, we can only draw a typical sample path. Such
a sample path is shown in Figure 13.2. If the horizon time $T$ is long
enough, the optimal control will bring the inventory level to the goal
level $\bar{x} = 0$. It will then hover around this level until $t$ is
sufficiently close to the horizon $T$. During the ending phase, the optimal
control will try to build up the inventory level in response to a positive
valuation $B$ for ending inventory.

Figure 13.2: A Sample Path of $X_t$ with $X_0 = x_0 > 0$ and $B > 0$
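
A sample path of this kind can be generated with an Euler-Maruyama discretization of (13.38) under the feedback rule (13.59). This is a rough sketch under stated assumptions: it takes $y = e^{2(t-T)}$, uses a fixed step size, and the names `optimal_production` and `sample_path` as well as all default values are hypothetical choices, not part of the model.

```python
import math
import random


def optimal_production(x, t, T, S, B):
    """Feedback rule (13.59), assuming y = exp(2 (t - T))."""
    y = math.exp(2.0 * (t - T))
    return S + ((y - 1.0) * x + (B - 2.0 * S) * math.sqrt(y)) / (y + 1.0)


def sample_path(x0, T, S, B, sigma, n_steps=1000, seed=0):
    """Euler-Maruyama path of dX = (U - S) dt + sigma dz with U from (13.59)."""
    rng = random.Random(seed)
    dt = T / n_steps
    x, path = x0, [x0]
    for i in range(n_steps):
        t = i * dt
        u = optimal_production(x, t, T, S, B)
        x += (u - S) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
        path.append(x)
    return path
```

With `sigma = 0` the path is deterministic and, for a long horizon, decays toward the goal level 0 well before the horizon; a positive `sigma` produces the hovering behavior described in the remark.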



13.4 A Stochastic Advertising Problem 

In this section, we shall discuss a stochastic advertising model due to
Sethi (1983b). The model is:

$$\begin{cases} \max E\left[\displaystyle\int_0^\infty e^{-\rho t}(\pi X_t - U_t^2)\,dt\right] \\ \text{subject to} \\ dX_t = (rU_t\sqrt{1 - X_t} - \delta X_t)\,dt + \sigma(X_t)\,dz_t, \quad X_0 = x_0, \\ U_t \ge 0, \end{cases} \qquad (13.61)$$

where $X_t$ is the market share and $U_t$ is the rate of advertising at time $t$,
and where the other parameters are as specified in Section 7.2.1. Note
that the term in the integrand represents the discounted profit rate at
time $t$. Thus, the term in the square bracket represents the total
discounted profits on a sample path. The objective in (13.61) is, therefore,
to maximize the expected value of the total discounted profits.

This model is a modification as well as a stochastic extension of
the optimal control formulation of the Vidale-Wolfe advertising model
presented in (7.39). The Itô equation in (13.61) modifies the Vidale-Wolfe
dynamics (7.22) by replacing the term $ru(1 - x)$ by $rU_t\sqrt{1 - X_t}$ and
adding a diffusion term $\sigma(X_t)\,dz_t$ on the right-hand side. Furthermore,
we replace the linear cost of advertising $u$ in (7.39) by a quadratic cost of
advertising $U_t^2$ in (13.61). We also relax the control constraint
$0 \le u \le Q$ in (7.39) to simply $U_t \ge 0$. The addition of the diffusion
term yields a stochastic optimal control problem as expressed in (13.61).

An important consideration in choosing the function $\sigma(x)$ should be
that the solution $X_t$ to the Itô equation in (13.61) remains inside the
interval $[0, 1]$. Merely requiring that the initial condition $x_0 \in [0, 1]$,
as in Section 7.2.1, is no longer sufficient in the stochastic case. Additional
conditions need to be imposed. It is possible to specify these conditions
by using the theory presented by Gihman and Skorohod (1972) for
stochastic differential equations on a finite spatial interval. In our case,
the conditions boil down to the following, in addition to $x_0 \in (0, 1)$,
which has been assumed already in (13.61):

$$\sigma(x) > 0, \; x \in (0, 1) \quad \text{and} \quad \sigma(0) = \sigma(1) = 0. \qquad (13.62)$$

It is possible to show that for any feedback control $u(x)$ satisfying

$$u(x) \ge 0, \; x \in (0, 1], \quad \text{and} \quad u(0) > 0, \qquad (13.63)$$

the Itô equation in (13.61) will have a solution $X_t$ such that $0 < X_t < 1$,
almost surely (i.e., with probability 1). Since our solution for the optimal
advertising $u^*(x)$ would turn out to satisfy (13.63), we will have the
optimal market share $X_t^*$ lie in the interval $(0, 1)$.

Let $V(x)$ denote the value function for the problem, i.e., $V(x)$ is the
expected value of the discounted profits from time $t$ to infinity, when
$X_t = x$ and an optimal policy is followed from time $t$ onwards. Note
that since $T = \infty$, the future looks the same from any time $t$, and
therefore the value function does not depend on $t$. It is for this reason
we have defined the value function as $V(x)$, rather than $V(x, t)$ as in the
previous section.

Using now the principle of optimality as in Section 13.2, we can write
the HJB equation as

$$\rho V(x) = \max_u\left[\pi x - u^2 + V_x(ru\sqrt{1 - x} - \delta x) + \tfrac{1}{2}V_{xx}\sigma^2(x)\right]. \qquad (13.64)$$
Maximization of the RHS of (13.64) can be accomplished by taking its
derivative with respect to $u$ and setting it to zero. This gives

$$u(x) = \frac{rV_x\sqrt{1 - x}}{2}. \qquad (13.65)$$

Substitution of (13.65) in (13.64) and simplification of the resulting
expression yields the HJB equation

$$\rho V(x) = \pi x + \frac{V_x^2r^2(1 - x)}{4} - V_x\delta x + \tfrac{1}{2}\sigma^2(x)V_{xx}. \qquad (13.66)$$

As shown in Sethi (1983b), a solution of (13.66) is

$$V(x) = \bar{\lambda}x + \frac{\bar{\lambda}^2r^2}{4\rho}, \qquad (13.67)$$

where

$$\bar{\lambda} = \frac{\sqrt{(\rho + \delta)^2 + r^2\pi} - (\rho + \delta)}{r^2/2}, \qquad (13.68)$$

as derived in Exercise 7.40. In Exercise 13.4, you are asked to verify that
(13.67) and (13.68) solve the HJB equation (13.66).

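We can now obtain the explicit formula for the optimal feedback
control as

$$u^*(x) = \frac{r\bar{\lambda}\sqrt{1 - x}}{2}. \qquad (13.69)$$

Note that $u^*(x)$ satisfies the conditions in (13.63).

As in Exercise 7.40, it is easy to characterize (13.69) as

$$U_t^* = u^*(X_t) \begin{cases} > \bar{u} & \text{if } X_t < \bar{x}, \\ = \bar{u} & \text{if } X_t = \bar{x}, \\ < \bar{u} & \text{if } X_t > \bar{x}, \end{cases} \qquad (13.70)$$

where

$$\bar{x} = \frac{r^2\bar{\lambda}/2}{r^2\bar{\lambda}/2 + \delta} \qquad (13.71)$$

and

$$\bar{u} = \frac{r\bar{\lambda}\sqrt{1 - \bar{x}}}{2}, \qquad (13.72)$$

as given in (7.48).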
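
The formulas of this section can be checked numerically. The sketch below is an illustration only: it assumes the linear value function $V(x) = \bar{\lambda}x + \bar{\lambda}^2r^2/(4\rho)$ with $\bar{\lambda}$ the positive root of $\lambda^2r^2/4 + (\rho + \delta)\lambda - \pi = 0$, and the function names and parameter values are hypothetical. Because $V$ is linear, $V_{xx} = 0$ and the diffusion term drops out of the HJB residual.

```python
import math


def advertising_solution(r, delta, rho, pi):
    """Return (lambda_bar, x_bar, u_bar) as in (13.68), (13.71), (13.72)."""
    lam = (math.sqrt((rho + delta) ** 2 + r * r * pi) - (rho + delta)) / (r * r / 2.0)
    x_bar = (r * r * lam / 2.0) / (r * r * lam / 2.0 + delta)
    u_bar = r * lam * math.sqrt(1.0 - x_bar) / 2.0
    return lam, x_bar, u_bar


def value(x, lam, r, rho):
    return lam * x + lam * lam * r * r / (4.0 * rho)      # (13.67)


def hjb_residual(x, r, delta, rho, pi):
    """LHS minus RHS of (13.66) for the linear value function (V_xx = 0)."""
    lam, _, _ = advertising_solution(r, delta, rho, pi)
    v_x = lam
    rhs = pi * x + v_x ** 2 * r * r * (1.0 - x) / 4.0 - v_x * delta * x
    return rho * value(x, lam, r, rho) - rhs
```

At $x = \bar{x}$ and $u = \bar{u}$ the drift $r\bar{u}\sqrt{1 - \bar{x}} - \delta\bar{x}$ vanishes, which is why the market share hovers around $\bar{x}$.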
The market share trajectory for $X_t$ is no longer monotone because
of the random variations caused by the diffusion term $\sigma(X_t)\,dz_t$ in the
Itô equation in (13.61). Eventually, however, the market share process
hovers around the equilibrium level $\bar{x}$. It is, in this sense and as in the
previous section, also a turnpike result in a stochastic environment.



13.5 An Optimal Consumption-Investment Problem

In Example 1.3 in Chapter 1, we had formulated a problem faced by Rich
Rentier, who wants to consume his wealth in a way that will maximize his
total utility of consumption and bequest. In that example, Rich Rentier
kept his money in a savings plan earning interest at a fixed rate of $r > 0$.

In this section we shall offer Rich a possibility of investing a part
of his wealth in a risky security or stock that earns an expected rate of
return that equals $\alpha > r$. The problem of Rich, known now as Rich
Investor, is to optimally allocate his wealth between the risk-free savings
account and the risky stock over time and consume over time so as to
maximize his total utility of consumption. We shall assume an infinite
horizon problem in lieu of the bequest, for convenience in exposition. One
could, however, argue that Rich's bequest would be optimally invested
and consumed by his heir, who in turn would leave a bequest that would
be optimally invested and consumed by a succeeding heir and so on.
Thus, if Rich considers the utility accrued to all his heirs as his own, then
he can justify solving an infinite horizon problem without a bequest.

In order to formulate the stochastic optimal control problem of Rich
Investor, we must first model his investments. The savings account is
easy to model. If $S_0$ is the initial price of a unit of investment in the
savings account earning an interest at the rate $r > 0$, then we can write the
accumulated amount $S_t$ at time $t$ as

$$S_t = S_0e^{rt}.$$

This can be expressed as a differential equation, $dS_t/dt = rS_t$, which we
shall rewrite as

$$dS_t = rS_t\,dt, \quad S_0 \text{ given}. \qquad (13.73)$$

Modeling the stock is much more complicated. Merton (1971) and
Black and Scholes (1973) have proposed that the stock price $P_t$ can be
modeled by an Itô equation, namely,

$$\frac{dP_t}{P_t} = \alpha\,dt + \sigma\,dz_t, \quad P_0 \text{ given}, \qquad (13.74)$$

or simply,

$$dP_t = \alpha P_t\,dt + \sigma P_t\,dz_t, \quad P_0 \text{ given}, \qquad (13.75)$$

where $\alpha$ is the average rate of return on stock, $\sigma$ is the standard
deviation associated with the return, and $z_t$ is a standard Wiener process.
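
A price path satisfying (13.75) can be approximated with an Euler-Maruyama scheme. The sketch below is illustrative only; the function name `simulate_price`, the step count, and the seed are hypothetical choices. Setting `sigma = 0` recovers, up to discretization error, the deterministic growth $P_0e^{\alpha t}$, which parallels the savings account equation (13.73).

```python
import math
import random


def simulate_price(p0, alpha, sigma, t_end, n_steps=1000, seed=0):
    """Euler-Maruyama approximation of dP = alpha P dt + sigma P dz on [0, t_end]."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    p, path = p0, [p0]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))       # Wiener increment over one step
        p += alpha * p * dt + sigma * p * dz     # one step of (13.75)
        path.append(p)
    return path
```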



Remark 13.6 The LHS in (13.74) can be written also as $d\ln P_t$. Another
name for the process $z_t$ is Brownian Motion. Because of these, the price
process $P_t$ given by (13.74) is often referred to as a logarithmic Brownian
Motion.



In order to complete the formulation of Rich's stochastic optimal
control problem, we need the following additional notation:

$W_t$ = the wealth at time $t$,
$C_t$ = the consumption rate at time $t$,
$Q_t$ = the fraction of the wealth invested in stock at time $t$,
$1 - Q_t$ = the fraction of the wealth kept in the savings account at time $t$,
$U(c)$ = the utility of consumption when consumption is at the rate $c$; the function $U(c)$ is assumed to be increasing and concave,
$\rho$ = the rate of discount applied to consumption utility,
$B$ = the bankruptcy parameter to be explained later.



Next we develop the dynamics of the wealth process. Since the
investment decision $Q$ is unconstrained, Rich is allowed to buy
stock as well as to sell it short. Moreover, Rich can deposit in, as well
as borrow money from, the savings account at the rate $r$.

While it is possible to obtain rigorously the equation for the wealth
process involving an intermediate variable, namely, the number $N_t$ of
shares of stock owned at time $t$, we shall not do so. Instead, we shall
write the wealth equation informally as

$$\begin{aligned} dW_t &= Q_tW_t\alpha\,dt + Q_tW_t\sigma\,dz_t + (1 - Q_t)rW_t\,dt - C_t\,dt \\ &= (\alpha - r)Q_tW_t\,dt + (rW_t - C_t)\,dt + \sigma Q_tW_t\,dz_t, \quad W_0 \text{ given}, \end{aligned} \qquad (13.76)$$

and provide an intuitive explanation for it. The term $Q_tW_t\alpha\,dt$ represents
the expected return from the risky investment of $Q_tW_t$ dollars during the
period from $t$ to $t + dt$. The term $Q_tW_t\sigma\,dz_t$ represents the risk
involved in investing $Q_tW_t$ dollars in stock. The term $(1 - Q_t)rW_t\,dt$ is
the amount of interest earned on the balance of $(1 - Q_t)W_t$ dollars in the
savings account. Finally, $C_t\,dt$ represents the amount of consumption during
the interval from $t$ to $t + dt$.

In deriving (13.76), we have assumed that Rich can trade continuously
in time without incurring any broker's commission. Thus, the change in
wealth $dW_t$ from time $t$ to time $t + dt$ is due only to capital gains from
change in share price and to consumption. For a rigorous development
of (13.76) from (13.73) and (13.74), see Harrison and Pliska (1981).

Since Rich can borrow an unlimited amount and invest it in stock,
his wealth could fall to zero at some time $T$. We shall say that Rich goes
bankrupt at time $T$, when his wealth falls to zero at that time. It is clear
that $T$ is a random variable. It is, however, a special type of random
variable, called a stopping time, since it is observed exactly at the instant
of time when wealth falls to zero.

We can now specify Rich's objective function. It is:

$$\max\left\{J = E\left[\int_0^T e^{-\rho t}U(C_t)\,dt + e^{-\rho T}B\right]\right\}, \qquad (13.77)$$

where we have assumed that Rich experiences a payoff of $B$, in the units
of utility, at the time of bankruptcy. $B$ can be positive if there is a
social welfare system in place, or $B$ can be negative if there is remorse
associated with bankruptcy. See Sethi (1997a) for a detailed discussion
of the bankruptcy parameter $B$.

Let us recapitulate the optimal control problem of Rich Investor:

$$\begin{cases} \max\left\{J = E\left[\displaystyle\int_0^T e^{-\rho t}U(C_t)\,dt + e^{-\rho T}B\right]\right\} \\ \text{subject to} \\ dW_t = (\alpha - r)Q_tW_t\,dt + (rW_t - C_t)\,dt + \sigma Q_tW_t\,dz_t, \quad W_0 \text{ given}, \\ C_t \ge 0. \end{cases} \qquad (13.78)$$

As in the infinite horizon problem of Section 13.4, here also the value
function is stationary with respect to time $t$. This is because $T$ is a
stopping time of bankruptcy, and the future evolution of wealth, investment,
and consumption processes from any starting time $t$ depends only
on the wealth at time $t$ and not on time $t$ itself. Therefore, let $V(x)$
be the value function associated with an optimal policy beginning with
wealth $W_t = x$ at time $t$. Using the principle of optimality as in Section
13.2, the HJB equation satisfied by the value function $V(x)$ for problem
(13.78) can be written as

$$\begin{cases} \rho V(x) = \max_{c \ge 0,\, q}\left[(\alpha - r)qxV_x + (rx - c)V_x + \tfrac{1}{2}q^2\sigma^2x^2V_{xx} + U(c)\right], \\ V(0) = B. \end{cases} \qquad (13.79)$$
This problem and a number of its generalizations can be solved explicitly;
see Sethi (1997a).

For the purpose of this section, we shall simplify the problem by
making further assumptions. Let

$$U(c) = \ln c, \qquad (13.80)$$

as was used in Example 1.3. This utility has an important simplifying
property, namely,

$$U'(0) = 1/c\,|_{c=0} = \infty. \qquad (13.81)$$

We also assume $B = -\infty$. See Sethi (1997a, Chapter 2) for solutions
when $B > -\infty$.

Under these assumptions, Rich would be sufficiently conservative
in his investments so that he does not go bankrupt. This is because
bankruptcy at time $t$ means $W_t = 0$, implying “zero consumption”
thereafter, and a small amount of wealth would allow Rich to have nonzero
consumption resulting in a proportionally large amount of utility on
account of (13.81). While we have provided an intuitive explanation,
it is possible to show rigorously that condition (13.81) together with
$B = -\infty$ implies a strictly positive consumption level at all times and
no bankruptcy.

Since $Q$ is already unconstrained, having no bankruptcy and only
positive (i.e., interior) consumption level allows us to obtain the form of
the optimal consumption and investment policy simply by differentiating
the RHS of (13.79) with respect to $q$ and $c$ and equating the resulting
expressions to zero. Thus,

$$(\alpha - r)xV_x + q\sigma^2x^2V_{xx} = 0,$$

i.e.,

$$q(x) = -\frac{(\alpha - r)V_x}{x\sigma^2V_{xx}}, \qquad (13.82)$$

and, since $U'(c) = 1/c$,

$$c(x) = \frac{1}{V_x}. \qquad (13.83)$$



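Substituting (13.82) and (13.83) in (13.79) allows us to remove the
max operator from (13.79), and provides us with the equation

$$\rho V(x) = -\frac{\gamma(V_x)^2}{V_{xx}} + \left(rx - \frac{1}{V_x}\right)V_x - \ln V_x, \qquad (13.84)$$

where

$$\gamma = \frac{(\alpha - r)^2}{2\sigma^2}. \qquad (13.85)$$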

This is a nonlinear ordinary differential equation that appears to be
quite difficult to solve. However, Karatzas, Lehoczky, Sethi, and Shreve
(1986) used a change of variable that transforms (13.84) into a second-order,
linear, ordinary differential equation. They assumed that the value
function is strictly concave and, therefore, $V_x$ is monotonically decreasing
in $x$. This means that the function $c(\cdot)$ defined in (13.83) has an inverse
$X(\cdot)$ such that (13.84) can be rewritten as

$$\rho V(X(c)) = -\frac{\gamma[U'(c)]^2X'(c)}{U''(c)} + (rX(c) - c)U'(c) + U(c), \qquad (13.86)$$



where $'$ and $''$ denote, respectively, the first and second derivatives of
functions with respect to their arguments (and $'''$ the third). Differentiation
with respect to $c$ yields the intended second-order, linear ordinary
differential equation

$$\gamma X''(c) = \left[(r - \rho - 2\gamma)\frac{U''(c)}{U'(c)} + \gamma\frac{U'''(c)}{U''(c)}\right]X'(c) + \left[\frac{U''(c)}{U'(c)}\right]^2(rX(c) - c). \qquad (13.87)$$



This equation has an explicit solution with three parameters to be
determined; see Appendix A. After some calculations, one can determine
these parameters, and obtain the solution of (13.84) as

$$V(x) = \frac{1}{\rho}\ln(\rho x) + \frac{r - \rho + \gamma}{\rho^2}, \quad x > 0. \qquad (13.88)$$
In Exercise 13.3, you are asked to verify by direct substitution that
(13.88) is indeed a solution of (13.84). Moreover, $V(x)$
defined in (13.88) is strictly concave, so that our concavity assumption
made earlier is justified.

From (13.88), it is easy to show that (13.82) and (13.83) yield the
following feedback policies:

$$q^*(x) = \frac{\alpha - r}{\sigma^2}, \qquad (13.89)$$

$$c^*(x) = \rho x. \qquad (13.90)$$

The investment policy (13.89) says that the optimal fraction of the wealth
invested in the risky stock is $(\alpha - r)/\sigma^2$, i.e.,

$$Q_t^* = q^*(W_t) = \frac{\alpha - r}{\sigma^2}, \quad t \ge 0, \qquad (13.91)$$

which is a constant over time. The optimal consumption policy is to
consume a constant fraction $\rho$ of the current wealth, i.e.,

$$C_t^* = c^*(W_t) = \rho W_t, \quad t \ge 0. \qquad (13.92)$$

This problem and its many extensions have been studied in great 
detail. See, e.g., Sethi (1997a). 
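
The substitution check can also be carried out numerically. The following is a minimal sketch, assuming logarithmic utility $U(c) = \ln c$ (so that $U'(c) = 1/c = V_x$) and the form $\rho V = -\gamma V_x^2/V_{xx} + (rx - c)V_x + U(c)$ for (13.84). The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def gamma_coeff(alpha, r, sigma):
    """gamma as in (13.85): (alpha - r)^2 / (2 sigma^2)."""
    return (alpha - r) ** 2 / (2.0 * sigma ** 2)

def value(x, rho, r, g):
    """Candidate value function (13.88)."""
    return math.log(rho * x) / rho + (r - rho + g) / rho ** 2

def hjb_residual(x, rho, r, g, h=1e-4):
    """Left side minus right side of (13.84), assuming U(c) = ln c."""
    v0 = value(x, rho, r, g)
    vp = value(x + h, rho, r, g)
    vm = value(x - h, rho, r, g)
    v_x = (vp - vm) / (2.0 * h)            # central difference for V_x
    v_xx = (vp - 2.0 * v0 + vm) / h ** 2   # central difference for V_xx
    c = 1.0 / v_x                          # from U'(c) = 1/c = V_x
    rhs = -g * v_x ** 2 / v_xx + (r * x - c) * v_x + math.log(c)
    return rho * v0 - rhs

# Hypothetical parameter values, chosen only for illustration.
alpha, r, sigma, rho = 0.12, 0.05, 0.30, 0.10
g = gamma_coeff(alpha, r, sigma)
for x in (0.5, 2.0, 10.0):
    print(x, hjb_residual(x, rho, r, g))
```

The printed residuals should be zero up to finite-difference error. The implied consumption `c = 1 / v_x` is approximately `rho * x`, in line with (13.90).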



13.6 Concluding Remarks 

In this chapter, we have considered stochastic optimal control problems
subject to Itô differential equations. For impulse stochastic control, see
Bensoussan and Lions (1984). For stochastic control problems with jump 
Markov processes or, more generally, martingale problems, see Fleming 
and Soner (1992), Davis (1993), and Karatzas and Shreve (1998). 

For applications of stochastic optimal control to manufacturing prob- 
lems, see Sethi and Zhang (1994a) and Yin and Zhang (1997). For ap- 
plications to problems in finance, see Sethi (1997a) and Karatzas and 
Shreve (1998). For applications in marketing, see Tapiero (1988), Ra- 
man (1990), and Sethi and Zhang (1995). For applications of stochastic 
optimal control to economics including economics of natural resources, 
see, e.g., Pindyck (1978a, 1978b), Rausser and Hochman (1979), Arrow 
and Chang (1980), Derzko and Sethi (1981a), Bensoussan and Lesourne
(1980, 1981), Malliaris and Brock (1982), and Brekke and Øksendal
(1994).
EXERCISES FOR CHAPTER 13

13.1 Consider the discrete-time dynamics

$$x^{t+1} - x^t = ax^t + w^t, \qquad y^t = hx^t + v^t, \tag{13.93}$$

where $w^t$ and $v^t$ are Gaussian purely random sequences with

$$E[w^t] = E[v^t] = 0, \quad E[w^t w^\tau] = q\delta_{t\tau}, \quad E[v^t v^\tau] = r\delta_{t\tau},$$

where $a$, $h$, $q$, and $r$ are constants. The initial condition $x^0$ is a
Gaussian random variable with mean $\bar{x}^0$ and variance $m_0$, and let
$p_0 = rm_0/(r + m_0h^2)$. Show that the Kalman filter is given by

$$\hat{x}^{t+1} - \hat{x}^t = a\hat{x}^t + \frac{p_{t+1}h}{r}\left(y^{t+1} - h(a+1)\hat{x}^t\right),$$

$$p_t = \frac{r[(a+1)^2 p_{t-1} + q]}{r + h^2[(a+1)^2 p_{t-1} + q]}, \quad p_0 \text{ as above}.$$

13.2 Consider the continuous-time dynamics

$$\begin{cases} \dot{x} = w, \\ y = 2x + v, \end{cases} \tag{13.94}$$

where $w$ and $v$ are white noise processes with

$$E[w(t)] = E[v(t)] = 0, \quad E[w(t)w(\tau)] = q\delta(t - \tau), \quad E[v(t)v(\tau)] = r\delta(t - \tau),$$

where $q$ and $r$ are constants. The initial condition $x(0)$ is a Gaussian
random variable with mean 0 and variance $p_0$. Show that the
Kalman-Bucy filter is given by

$$\dot{\hat{x}} = \frac{2P}{r}[y - 2\hat{x}], \quad \hat{x}(0) = 0,$$

and

$$\dot{P} = -\frac{4P^2}{r} + q, \quad P(0) = p_0.$$
13.3 Verify by direct substitution that the value function defined by
(13.67) and (13.68) solves the HJB equation (13.66).

13.4 Verify by direct substitution that the value function in (13.88)
solves the HJB equation (13.84).

13.5 Solve the consumption-investment problem (13.78) with the utility
function

$$U(c) = c^\beta, \quad 0 < \beta < 1,$$

and $B = 0$.

13.6 Solve Exercise 13.5 when $\beta < 0$ and $B = -\infty$.




Appendix A 

Solutions of Linear
Differential Equations



A.1 Linear Differential Equations with
Constant Coefficients

Linear differential equations with constant coefficients are usually
written as

$$y^{(n)} + a_1 y^{(n-1)} + \cdots + a_{n-1} y^{(1)} + a_n y = g, \tag{A.1}$$

where $a_k$, $k = 1, \ldots, n$, are numbers, $y^{(k)} = d^k y/dt^k$, and $g = g(t)$
is a known function of $t$. We shall denote by $D = d/dt$ the derivative
operator, so that the differential equation now becomes

$$p(D)y = (D^n + a_1 D^{n-1} + \cdots + a_{n-1}D + a_n)y = g. \tag{A.2}$$

If $g(t) \equiv 0$, the equation is said to be homogeneous. If $g(t) \not\equiv 0$,
then the homogeneous or reduced equation is obtained from (A.2) by
replacing $g$ by 0.

If $y$ and $y^*$ are two different solutions of (A.2), then it is easy to
show that $y - y^*$ solves the reduced equation of (A.2). Hence, if $y$ is any
solution to (A.2), it can be written as

$$y = y^* + \hat{y}, \tag{A.3}$$

where $y^*$ is any other particular solution to (A.2) and $\hat{y}$ is a suitable
solution to the homogeneous equation. Therefore, solving (A.2) involves
(a) finding all the solutions to the homogeneous equation, called the
general solution, and (b) finding a particular solution to the given equation.
The rest of these notes indicate how to solve these two problems.

Given (A.1) the auxiliary equation is

$$p(m) = m^n + a_1 m^{n-1} + \cdots + a_{n-1}m + a_n = 0. \tag{A.4}$$

In other words, $p(m)$ is obtained from $p(D)$ by replacing $D$ by $m$. The
auxiliary equation is an ordinary polynomial equation of $n$th degree and
has $n$ real or complex roots, counting multiple roots according to their
multiplicity. We will see that, given these roots, we can write the general
solution forms of homogeneous linear differential equations.



A.2 Homogeneous Equations of Order One

Here the equation is

$$(D - a)y = y' - ay = 0,$$

which has $y = Ce^{at}$ as its general solution form.

A.3 Homogeneous Equations of Order Two

Here the differential equation can be factored (using the quadratic
formula) as

$$(D - m_1)(D - m_2)y = 0,$$

where $m_1$ and $m_2$ can be real or complex. Examples are given in Table
A.1 and the solution forms are given in Table A.2.



Differential Equation        General Solution Form

1. $y'' - 4y' + 4y = 0$      $y(t) = e^{2t}(C_1 + C_2 t)$

2. $y'' - 4y' + 3y = 0$      $y(t) = C_1 e^{3t} + C_2 e^{t}$
                             $= e^{2t}(D_1 \sinh t + D_2 \cosh t)$

3. $y'' - 4y' + 5y = 0$      $y(t) = e^{2t}(C_1 e^{it} + C_2 e^{-it})$
                             $= e^{2t}(D_1 \sin t + D_2 \cos t)$

Table A.1: Examples of Homogeneous Equations of Order Two
Roots                              General Solution Form

$m_1 \ne m_2$, real                $y(t) = C_1 e^{m_1 t} + C_2 e^{m_2 t}$
$m_1 = a + b$, $m_2 = a - b$       $= e^{at}(C_1 e^{bt} + C_2 e^{-bt})$
                                   or
                                   $y(t) = e^{at}(C_1 \sinh bt + C_2 \cosh bt)$

$m_1 = m_2 = m$                    $y(t) = (C_1 + C_2 t)e^{mt}$

$m_1 \ne m_2$, complex             $y(t) = C_1 e^{m_1 t} + C_2 e^{m_2 t}$
$m_1 = a + bi$, $m_2 = a - bi$     or
                                   $y(t) = e^{at}(D_1 \sin bt + D_2 \cos bt)$
                                   or
                                   $y(t) = e^{at}[E_1 \sin(bt + E_2)]$
                                   or
                                   $y(t) = e^{at}[F_1 \cos(bt + F_2)]$

Table A.2: General Solution Forms for Second-Order Linear
Homogeneous Equations, Constant Coefficients
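
The case distinction of Table A.2 can be sketched in a few lines of code. The following is an illustrative sketch only: it takes the coefficients of $y'' + by' + cy = 0$, finds the roots of the auxiliary equation $m^2 + bm + c = 0$, and reports which row of the table applies. The function name `general_solution_form` and the tolerance `tol` are hypothetical.

```python
import cmath

def general_solution_form(b, c, tol=1e-12):
    """Classify y'' + b*y' + c*y = 0 (real b, c) by its auxiliary roots.

    Returns (case, m1, m2), where m1 and m2 solve m^2 + b*m + c = 0 and
    case names the matching row of Table A.2.
    """
    disc = b * b - 4.0 * c
    if abs(disc) <= tol:
        m = -b / 2.0
        return "repeated", m, m          # y = (C1 + C2 t) e^{m t}
    root = cmath.sqrt(disc)
    m1, m2 = (-b + root) / 2.0, (-b - root) / 2.0
    if disc > 0:
        return "real", m1.real, m2.real  # y = C1 e^{m1 t} + C2 e^{m2 t}
    return "complex", m1, m2             # y = e^{at}(D1 sin bt + D2 cos bt)

# Three equations with a repeated root, two real roots, and a complex pair.
print(general_solution_form(-4, 4))
print(general_solution_form(-4, 3))
print(general_solution_form(-4, 5))
```

For a complex pair the real part of either root plays the role of $a$ and the imaginary part the role of $b$ in the table.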



A.4 Homogeneous Equations of Order n

When (A.2) is of order $n$, the auxiliary equation $p(m) = 0$ has $n$ roots,
when multiple roots are counted according to their multiplicity. Also,
complex roots occur in conjugate pairs. The general solution of the
homogeneous equation is the sum of the solutions associated with each
multiple root. They can be found in Table A.4 for each root and should
be added together to form the general solution. First, we give some
examples in Table A.3.
Differential Equation                              General Solution Form

1. $D^2(D^2 - 4D + 4)y = 0$                        $y(t) = C_1 + C_2 t + e^{2t}(C_3 + C_4 t)$

2. $(D - 3)^2(D + 5)^3(D^2 - 4D + 5)^2 y = 0$      $y(t) = e^{3t}(C_1 + C_2 t)$
                                                   $+ e^{-5t}(C_3 + C_4 t + C_5 t^2)$
                                                   $+ e^{2t}[(C_6 + C_7 t)\sin t + (C_8 + C_9 t)\cos t]$

3. $(D^2 - 2D + 2)^3(D^2 - 2D - 3)^2 y = 0$        $y(t) = e^{t}[(C_1 + C_2 t + C_3 t^2)\sin t$
                                                   $+ (C_4 + C_5 t + C_6 t^2)\cos t]$
                                                   $+ (C_7 + C_8 t)e^{3t} + (C_9 + C_{10} t)e^{-t}$

Table A.3: Examples of Homogeneous Equations of Order n



Root                 Multiplicity   General Solution Form $y_j(t)$

$m_j$, real          $r_j = 1$      $y_j(t) = Ce^{m_j t}$
                     $r_j > 1$      $y_j(t) = (C_1 + C_2 t + \cdots + C_{r_j} t^{r_j - 1})e^{m_j t}$

Complex conjugate    $r_j = 1$      $e^{a_j t}(C_1 \sin b_j t + C_2 \cos b_j t)$
$a_j \pm b_j i$      $r_j > 1$      $e^{a_j t}[(C_1 + C_2 t + \cdots + C_{r_j} t^{r_j - 1})\sin b_j t$
                                    $+ (C_{r_j + 1} + C_{r_j + 2} t + \cdots + C_{2r_j} t^{r_j - 1})\cos b_j t]$

Table A.4: General Solution Forms for Multiple Roots of Auxiliary
Equation



A.5 Particular Solutions of Linear D.E. with
Constant Coefficients

The particular solution to the inhomogeneous equation (A.2) can usually
be found by guessing the form of the answer and then verifying the
guess by substitution. Table A.5 shows the correct forms for guessing
for various kinds of forcing functions $g(t)$. Note that the form of the
guess depends on whether certain numbers are roots of the auxiliary
equation. Table A.6 gives examples of differential equations along with
their particular integrals.
Forcing Function, $g(t)$                          K          Particular Integral, $y(t)$

(1) $c$                                           0          $A$
(2) $h(t)$                                        0          $X$
(3) $c\sin qt$ or $c\cos qt$                      $iq$       $A\sin qt + B\cos qt$
(4) $ce^{pt}$                                     $p$        $Ae^{pt}$
(5) $ce^{pt}\sin qt$ or $ce^{pt}\cos qt$          $p + iq$   $Ae^{pt}\sin qt + Be^{pt}\cos qt$
(6) $h(t)e^{pt}$                                  $p$        $Xe^{pt}$
(7) $h(t)\sin qt$ or $h(t)\cos qt$                $iq$       $X\sin qt + Y\cos qt$
(8) $h(t)e^{pt}\sin qt$ or $h(t)e^{pt}\cos qt$    $p + iq$   $Xe^{pt}\sin qt + Ye^{pt}\cos qt$

Notation.

(a) In the forcing function column, $p$, $q$, and $c$ are given constants and $h(t)$ is a
given polynomial of degree $s$.

(b) In the particular integral column, $A$ and $B$ are coefficients to be determined
and

$$X = A_0 + A_1 t + \cdots + A_s t^s, \quad Y = B_0 + B_1 t + \cdots + B_s t^s$$

are $s$ degree polynomials whose coefficients are to be determined.

Rules.

(a) If the number in the K column is not a root of the auxiliary equation $p(m) = 0$,
then the proper guess for the particular integral is as shown.

(b) If the number in the K column is a root of the auxiliary equation of degree $r$,
then multiply the guess in the last column by $t^r$.

Table A.5: Particular Solution Forms for Various Forcing Functions



If the forcing function $g(t)$ is the sum of several functions,
$g = g_1 + g_2 + \cdots + g_k$, each having one of the forms in the table,
then solve for each $g_i$ separately and add the results together to get
the complete solution.

In the next table, we will apply the formulas and the rules in Table
A.5 to obtain particular integrals in specific examples.
Differential Equation


Particular Integral 


lO 

II 

CO 

1 




2. y'^' - 3y” = 1 4- 3t -h 


t^(Ao -1- Alt + A2t^) 


3. j/"- V + 4j/ = 3-t2 


Aq 4" Alt 4 A2t^ 


4. y” — Ay' 4- 4y = 2sin t 


A sin t4Bcos t 


5. y" — Ay' -\-Ay~ 5 sin 2t 


A sin 2t4Bcos 2t 


6. y" — Ay' A-Ay= lOe^^ 


Ae^^ 


7. y" - Ay' + Ay= lOe^* 


i2(Ae^') 


8. y" — Ay' + cos t 


t(Ae^^ sin t 4 cos t) 


9. y" — Ay' -I- 5?/ = sin t 


^®_o(A^4 sin t 4 cos t) 


10. y" — Ay' 4- 5y = cos t 


t 1 4 cos t) 



Table A. 6: Particular Integrals in Specific Examples 
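
As a worked illustration of rule (a), consider the fourth example of the table, read here as $y'' - 4y' + 4y = 2\sin t$. The coefficients are an assumption where the source is hard to read. Since $K = i$ is not a root of $m^2 - 4m + 4 = 0$, the guess is $A\sin t + B\cos t$, and substitution determines the coefficients:

```latex
\begin{aligned}
y &= A\sin t + B\cos t, \quad y' = A\cos t - B\sin t, \quad y'' = -A\sin t - B\cos t,\\
y'' - 4y' + 4y &= (3A + 4B)\sin t + (3B - 4A)\cos t = 2\sin t,\\
3A + 4B &= 2, \quad 3B - 4A = 0 \;\Longrightarrow\; A = \tfrac{6}{25}, \quad B = \tfrac{8}{25}.
\end{aligned}
```

The particular integral in this case is therefore $y(t) = (6\sin t + 8\cos t)/25$.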

A.6 Integrating Factor

Consider the first-order linear equation

$$y' + ay = f(t). \tag{A.5}$$

If we multiply both sides of the equation by the integrating factor $e^{at}$,
we get

$$\frac{d}{dt}(ye^{at}) = y'e^{at} + aye^{at} = e^{at}f(t). \tag{A.6}$$

Integrating from 0 to $t$ we have

$$y(t) = y(0)e^{-at} + \int_0^t e^{-a(t-\tau)}f(\tau)\,d\tau, \tag{A.7}$$

which is the complete solution (homogeneous solution plus particular
solution) to the equation.
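
The solution formula can be checked numerically. The following is a minimal sketch, assuming (A.7) has the form $y(t) = y(0)e^{-at} + \int_0^t e^{-a(t-\tau)}f(\tau)\,d\tau$. It evaluates the integral by Simpson's rule and compares the result with a case whose closed-form solution is known. The function name `solve_first_order` and the test data are hypothetical.

```python
import math

def solve_first_order(a, f, y0, t, n=2000):
    """Evaluate the solution of y' + a*y = f(t), y(0) = y0, at time t.

    The integral in the solution formula is approximated by Simpson's rule
    with n subintervals on [0, t]; n must be even.
    """
    h = t / n

    def integrand(tau):
        return math.exp(-a * (t - tau)) * f(tau)

    total = integrand(0.0) + integrand(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return y0 * math.exp(-a * t) + total * h / 3.0

# Hypothetical test case: y' + 2y = 1, y(0) = 0 has y(t) = (1 - e^{-2t})/2.
approx = solve_first_order(2.0, lambda tau: 1.0, 0.0, 1.5)
exact = (1.0 - math.exp(-3.0)) / 2.0
print(approx, exact)
```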
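A.7 Reduction of Higher-Order Linear
Equations to Systems of First-Order
Linear Equations

Another way of solving equation (A.1) is to convert it into a system of
first-order linear equations. We use the transformations

$$z_1 = y, \quad z_2 = y^{(1)}, \quad \ldots, \quad z_n = y^{(n-1)}, \tag{A.8}$$

so that (A.1) can be written as

$$\begin{bmatrix} z_1' \\ z_2' \\ \vdots \\ z_{n-1}' \\ z_n' \end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1
\end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_{n-1} \\ z_n \end{bmatrix} +
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ g \end{bmatrix}. \tag{A.9}$$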



In vector form this equation reads

$$z' = Az + b \tag{A.10}$$

with the obvious definitions obtained by comparing (A.9) and (A.10).
We will present two ways of solving the first-order system (A.10).

The first method involves the matrix exponential function defined
by the power series

$$e^{tA} = I + tA + \frac{t^2A^2}{2!} + \cdots = \sum_{k=0}^{\infty} \frac{t^kA^k}{k!}. \tag{A.11}$$

It can be shown that this series converges (component by component)
for all values of $t$. Also it is differentiable (component by component)
for all values of $t$ and satisfies

$$\frac{d}{dt}(e^{tA}) = Ae^{tA} = (e^{tA})A. \tag{A.12}$$

By analogy with Section A.6, we try $e^{-tA}$ as the integrating factor for
(A.10) to obtain

$$e^{-tA}z' - e^{-tA}Az = e^{-tA}b.$$

(Note that the order of matrix multiplication here is important.) Using
the product rule for matrix multiplication of functions, which can be
shown to be valid, the above equation becomes

$$\frac{d}{dt}(e^{-tA}z) = e^{-tA}b.$$

Integrating from 0 to $t$ gives

$$e^{-\tau A}z(\tau)\Big|_0^t = \int_0^t e^{-\tau A}b(\tau)\,d\tau.$$

Evaluating and solving, we have

$$z(t) = e^{tA}z(0) + \int_0^t e^{(t-\tau)A}b(\tau)\,d\tau. \tag{A.13}$$

The analogy between this equation and (A.7) is clear.

Although (A.13) represents a formal expression for the solution of
(A.10), it does not provide a computationally convenient way of getting
explicit solutions. In order to demonstrate such a method we assume
that the matrix $A$ is diagonalizable, i.e., that there exists a nonsingular
square matrix $P$ such that

$$P^{-1}AP = \Lambda. \tag{A.14}$$

Here $\Lambda$ is the diagonal matrix

$$\Lambda = \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix}, \tag{A.15}$$

where the diagonal elements, $\lambda_1, \ldots, \lambda_n$, are eigenvalues of $A$. The $i$th
column of $P$ is the column eigenvector associated with the eigenvalue $\lambda_i$
(to see this multiply both sides of (A.14) by $P$ on the left). By looking
at (A.11) it is easy to see that

$$e^{t\Lambda} = \begin{bmatrix}
e^{\lambda_1 t} & 0 & \cdots & 0 \\
0 & e^{\lambda_2 t} & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & e^{\lambda_n t}
\end{bmatrix}. \tag{A.16}$$

Suppose we make the following definitions:

$$z = Pw, \quad z(0) = Pw(0), \quad z' = Pw'. \tag{A.17}$$

These in turn imply

$$w = P^{-1}z, \quad w(0) = P^{-1}z(0), \quad w' = P^{-1}z'. \tag{A.18}$$

Substituting (A.17) into (A.10) gives

$$Pw' = APw + b,$$
$$w' = P^{-1}APw + P^{-1}b,$$

which by using (A.14) gives

$$w' = \Lambda w + P^{-1}b. \tag{A.19}$$

Since $\Lambda$ is a diagonal matrix, it is easy to solve the homogeneous part of
(A.19), which is

$$w' = \Lambda w. \tag{A.20}$$

The solution is

$$w_i = w_i(0)e^{\lambda_i t} \quad \text{for } i = 1, \ldots, n.$$

We solve (A.19) completely by multiplying through by the integrating
factor $e^{-t\Lambda}$:

$$\frac{d}{dt}(e^{-t\Lambda}w) = e^{-t\Lambda}w' - e^{-t\Lambda}\Lambda w = e^{-t\Lambda}P^{-1}b.$$

Integrating this equation from 0 to $t$ gives

$$w(t) = e^{t\Lambda}w(0) + \int_0^t e^{(t-\tau)\Lambda}P^{-1}b(\tau)\,d\tau. \tag{A.21}$$

Using the substitutions (A.18) yields

$$z(t) = (Pe^{t\Lambda}P^{-1})z(0) + Pe^{t\Lambda}\int_0^t e^{-\tau\Lambda}P^{-1}b(\tau)\,d\tau, \tag{A.22}$$

which is the formal solution to (A.10). Since well-known algorithms are
available for finding eigenvalues and eigenvectors of a matrix, the solution
(A.22) can be found in a straightforward manner.
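
The two routes to the matrix exponential can be compared numerically. The following is a minimal sketch for the $2 \times 2$ case, using only the standard library: it sums the power series for $e^{tA}$ directly and compares the result with $Pe^{t\Lambda}P^{-1}$. The helper names and the illustrative matrix (with eigenvalues 5 and 1 and eigenvectors $(1,1)$ and $(1,-1)$) are hypothetical choices.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, t, terms=60):
    """e^{tA} from its power series, truncated after `terms` terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term becomes (tA)^k / k! by multiplying the previous term by tA/k
        term = matmul(term, [[t * a / k for a in row] for row in A])
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def expm_diag(P, lam, P_inv, t):
    """e^{tA} = P e^{t Lambda} P^{-1} for a diagonalizable 2x2 matrix A."""
    E = [[math.exp(lam[0] * t), 0.0], [0.0, math.exp(lam[1] * t)]]
    return matmul(matmul(P, E), P_inv)

# Illustrative data: A is symmetric with eigenvalues 5 and 1.
A = [[3.0, 2.0], [2.0, 3.0]]
P = [[1.0, 1.0], [1.0, -1.0]]
P_inv = [[0.5, 0.5], [0.5, -0.5]]
t = 0.3
S = expm_series(A, t)
D = expm_diag(P, [5.0, 1.0], P_inv, t)
print(S)
print(D)
```

With $b = 0$ the solution of the system is then `matmul`-style multiplication of either matrix with $z(0)$. The diagonalization route needs only scalar exponentials once the eigenvalues and eigenvectors are known.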
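A.8 Solution of Linear Two-Point Boundary
Value Problems

In linear-quadratic control problems with linear salvage values (e.g., the
production-inventory problem in Section 6.1) we require the solution of
linear two-point boundary value problems of the form

$$\begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} =
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} x \\ \lambda \end{bmatrix} +
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \tag{A.23}$$

with boundary conditions

$$x(0) = x_0 \quad \text{and} \quad \lambda(T) = \lambda_T. \tag{A.24}$$

The solution of this system will be of the form (A.22), which can be
restated as

$$\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} =
\begin{bmatrix} Q_{11}(t) & Q_{12}(t) \\ Q_{21}(t) & Q_{22}(t) \end{bmatrix}
\begin{bmatrix} x(0) \\ \lambda(0) \end{bmatrix} +
\begin{bmatrix} R_1(t) \\ R_2(t) \end{bmatrix}, \tag{A.25}$$

where the $\lambda(0)$ is a vector of unknowns. They can be determined by
setting

$$\lambda_T = Q_{21}(T)x(0) + Q_{22}(T)\lambda(0) + R_2(T), \tag{A.26}$$

which is a system of linear equations for the variables $\lambda(0)$.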


A.9 Homogeneous Partial Differential
Equations

A homogeneous partial differential equation is an equation containing 
one or more partial derivatives of an unknown function with respect to 
its independent variables. If the highest partial derivative appearing ex- 
plicitly in the equation has order n, then the partial differential equation 
is said to be of order n. 

As we saw in the previous sections, the general solutions of ordinary 
differential equations involve expressions containing arbitrary constants. 
Similarly, the solutions of partial differential equations are expressions 
containing arbitrary (differentiable) functions. Conversely, when arbi- 
trary functions can be eliminated algebraically from a given expression, 
after suitable partial derivatives have been taken, then the result is a
partial differential equation.

Example A.1 Eliminate the arbitrary function $f$ from the expression
$u = f(ax - by)$, where $a$ and $b$ are non-zero constants.

Solution. Taking partial derivatives, we have

$$u_x = af' \quad \text{and} \quad u_y = -bf'$$

so that

$$bu_x + au_y = abf' - abf' = 0. \tag{A.27}$$

Here $u = f(ax - by)$ is a solution for the equation $bu_x + au_y = 0$.

To show that any solution $u = g(x, y)$ can be written in this form,
we set $s = ax - by$, and define

$$G(s, y) = g\left(\frac{s + by}{a}, y\right) = g(x, y).$$

Then, $g_x = G_s\,\partial s/\partial x = aG_s$ and $g_y = G_s\,\partial s/\partial y + G_y = -bG_s + G_y$.
Since we assume $g$ solves the equation $bu_x + au_y = 0$, we have

$$0 = bg_x + ag_y = abG_s - abG_s + aG_y = aG_y, \tag{A.28}$$

but this implies $G_y = 0$ so that $G$ is a function of $s = ax - by$ only,
and hence $g(x, y) = G(s) = G(ax - by)$ is of the required form. We
conclude that $u = f(ax - by)$ is a general solution form for $bu_x + au_y = 0$.
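
The conclusion of Example A.1 can be spot-checked numerically. The following is a minimal sketch that estimates $bu_x + au_y$ by central differences for $u = f(ax - by)$. The function name `pde_residual`, the choice $f = \sin$, the constants, and the evaluation point are all hypothetical.

```python
import math

def pde_residual(f, a, b, x, y, h=1e-5):
    """Central-difference estimate of b*u_x + a*u_y for u = f(a*x - b*y)."""
    def u(x_, y_):
        return f(a * x_ - b * y_)
    u_x = (u(x + h, y) - u(x - h, y)) / (2.0 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2.0 * h)
    return b * u_x + a * u_y

# Hypothetical choices of f, a, b, and the evaluation point.
res = pde_residual(math.sin, 2.0, 3.0, 0.7, -0.4)
print(res)
```

The printed value should vanish up to finite-difference error, whatever differentiable $f$ is used.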

Example A.2 Eliminate the arbitrary functions $f_1$ and $f_2$ from the
expression $u = f_1(x)f_2(y)$.

Solution. Taking partial derivatives, we have

$$u_x = f_1'f_2, \quad u_y = f_1f_2', \quad \text{and} \quad u_{xy} = f_1'f_2'$$

so that

$$uu_{xy} - u_xu_y = f_1f_2f_1'f_2' - f_1'f_2f_1f_2' = 0.$$

As in Example A.1 we conclude that $u = f_1(x)f_2(y)$ is the general
solution form of the equation $uu_{xy} - u_xu_y = 0$.

The subject of partial differential equations is too vast to even survey
here. However, Table A.7 gives general solution forms for all the
homogeneous partial differential equations we will consider in this book,
as well as others.







Partial Differential Equation                General Solution Form

(1) $bu_x + au_y = 0$                        $u = f(ax - by)$
(2) $xu_x + yu_y = 0$, $x \ne 0$             $u = f(y/x)$
(3) $xu_x - yu_y = 0$                        $u = f(xy)$
(4) $u_x + u_y = au$                         $u = f_1(x - y)e^{ax} + f_2(x - y)e^{ay}$
(5) $u_x + u_y = au^k$, $k \ne 1$            $u = [(k-1)(f_1(x - y) - ax)]^{1/(1-k)}$
                                             $+ [(k-1)(f_2(x - y) - ay)]^{1/(1-k)}$
(6) $u_{xx} - a^2u_{yy} = 0$                 $u = f_1(y + ax) + f_2(y - ax)$
(7) $u_{xx} + a^2u_{yy} = 0$                 $u = f_1(y + iax) + f_2(y - iax)$, $i = \sqrt{-1}$
(8) $u_{xy} = 0$                             $u = f_1(x) + f_2(y)$
(9) $uu_{xy} - u_xu_y = 0$                   $u = f_1(x)f_2(y)$
(10) $uu_{xy} + u_xu_y = 0$                  $u = \sqrt{f_1(x) + f_2(y)}$
(11) $bcu_x + acu_y + abu_z = 0$             $u = f_1(ax - by) + f_2(by - cz) + f_3(cz - ax)$
(12) $xu_x + yu_y + zu_z = 0$                $u = f_1(x/y) + f_2(y/z)$
(13) $xu_x - yu_y + zu_z = 0$                $u = f_1(xy) + f_2(yz)$
(15) $u^2u_{xyz} - u_xu_yu_z = 0$            $u = f_1(x)f_2(y)f_3(z)$
(16) $u_{xyz} = 0$                           $u = f_1(x) + f_2(y) + f_3(z)$
(17) $u_{xx} - a^2(u_{yy} + u_{zz}) = 0$     $u = f_1(y + ax) + f_2(y - ax) + f_3(z + ax)$
                                             $+ f_4(z - ax)$
(18) $u_x + u_y + u_z = au$                  $u = f_1(x - y)e^{ax} + f_2(y - z)e^{ay} + f_3(z - x)e^{az}$

Note. The functions $f_i$ are arbitrary differentiable functions of a single variable;
$a, b, c, \ldots$ stand for arbitrary (non-zero) constants.

Table A.7: General Solution Forms for Some Homogeneous Partial
Differential Equations

A.10 Inhomogeneous Partial Differential
Equations

As in the ordinary case, an inhomogeneous partial differential equation
is obtained from a homogeneous one by adding one or more forcing
functions. The general case of this problem is too difficult to treat here.
We consider only the case in which the forcing functions are separable,
i.e., can be written as a sum of functions each involving only one of the
independent variables. In solving such problems we can make use of the
solutions to ordinary differential equations considered earlier.

Example A.3 Solve the partial differential equation

$$u_x + u_y = 3x^2 + e^y.$$

Solution. We know from the previous section that the general solution
to the homogeneous equation is of the form $f(x - y)$. To get the particular
solutions we solve separately the ordinary equations

$$u_x = 3x^2 \quad \text{and} \quad u_y = e^y,$$

obtaining solutions $x^3$ and $e^y$. Therefore, the general solution to the
original equation is

$$u = f(x - y) + x^3 + e^y.$$

Generally speaking, the above philosophy of finding particular solutions
to separable partial differential equations (when it works) follows
the same method of “dividing and conquering.” Other methods involve
the use of series. We will not go further here for lack of space.

A.11 Solutions of Finite Difference Equations

In this book we will have uses for finite difference equations only in
Chapters 8 and 9. For that reason we will give only a brief introduction
to solution techniques for them. Readers who wish more details can
consult one of several texts on difference equations; see, e.g., Goldberg
(1986) or Spiegel (1971).

If $f(k)$ is a real function of time, then the difference operator applied
to $f$ is defined as

$$\Delta f(k) = f(k+1) - f(k). \tag{A.29}$$

The factorial power of $k$ is defined as

$$k^{(n)} = k(k-1)(k-2)\cdots(k-(n-1)). \tag{A.30}$$

It is easy to show that

$$\Delta k^{(n)} = nk^{(n-1)}. \tag{A.31}$$

Because this formula is similar to the corresponding formula for the
derivative $d(k^n)/dk$, the factorial powers of $k$ play a role for finite
differences analogous to the role that the ordinary powers of $k$ play in
differential calculus.

If $f(k)$ is a real function of time, then the anti-difference operator
$\Delta^{-1}$ applied to $f$ is defined as another function $g = \Delta^{-1}f(k)$ with the
property that

$$\Delta g = f(k). \tag{A.32}$$

One can easily show that

$$\Delta^{-1}k^{(n)} = \frac{1}{n+1}\,k^{(n+1)} + c, \tag{A.33}$$

where $c$ is an arbitrary constant. Equation (A.33) corresponds to the
integration formula for powers of $k$ in calculus.

Note that formulas (A.31) and (A.33) are similar to, respectively,
differentiation and integration of the power function $k^n$ in calculus. By
analogy with calculus, therefore, we can solve difference equations
involving polynomials in ordinary powers of $k$ by first rewriting them as
polynomials involving factorial powers of $k$ so that (A.31) and (A.33)
can be used. We show next how to do this.

A.11.1 Changing Polynomials in Powers of k into Factorial
Powers of k

We first give an abbreviated list of formulas that show how to change
powers of $k$ into factorial powers of $k$:

$$\begin{aligned}
k^0 &= k^{(0)} = 1 \quad \text{(by definition)},\\
k^1 &= k^{(1)},\\
k^2 &= k^{(1)} + k^{(2)},\\
k^3 &= k^{(1)} + 3k^{(2)} + k^{(3)},\\
k^4 &= k^{(1)} + 7k^{(2)} + 6k^{(3)} + k^{(4)},\\
k^5 &= k^{(1)} + 15k^{(2)} + 25k^{(3)} + 10k^{(4)} + k^{(5)}.
\end{aligned}$$

The coefficients of the factorial powers on the right-hand sides of these
equations are called Stirling numbers of the second kind, after the
person who first derived them. This list can be extended by using a
more complete table of these numbers, which can be found in books on
difference equations cited earlier.

Example A.4 Express $k^4 - 3k + 4$ in terms of factorial powers.

Solution. Using the equations above we have

$$k^4 = k^{(1)} + 7k^{(2)} + 6k^{(3)} + k^{(4)}, \quad -3k = -3k^{(1)}, \quad 4 = 4,$$

so that

$$k^4 - 3k + 4 = k^{(4)} + 6k^{(3)} + 7k^{(2)} - 2k^{(1)} + 4.$$

Example A.5 Solve the following difference equation in Example 8.7:

$$\Delta\lambda^k = -k + 5, \quad \lambda^6 = 0.$$

Solution. We first change the right-hand side into factorial powers so
that it becomes

$$\Delta\lambda^k = -k^{(1)} + 5.$$

Applying (A.33), we obtain

$$\lambda^k = -(1/2)k^{(2)} + 5k^{(1)} + c,$$

where $c$ is a constant. Applying the condition $\lambda^6 = 0$, we find that
$c = -15$, so that the solution is

$$\lambda^k = -(1/2)k^{(2)} + 5k^{(1)} - 15. \tag{A.34}$$

However, we would like the answer to be in ordinary powers of $k$.
The way to do that is discussed in the next section.

A.11.2 Changing Factorial Powers of k into Ordinary
Powers of k

In order to change factorial powers of $k$ into ordinary powers of $k$, we
make use of the following formulas:

$$\begin{aligned}
k^{(1)} &= k,\\
k^{(2)} &= -k + k^2,\\
k^{(3)} &= 2k - 3k^2 + k^3,\\
k^{(4)} &= -6k + 11k^2 - 6k^3 + k^4,\\
k^{(5)} &= 24k - 50k^2 + 35k^3 - 10k^4 + k^5.
\end{aligned}$$

The coefficients of the ordinary powers on the right-hand sides of these
equations are called Stirling numbers of the first kind. This list can also
be extended by using a more complete table of these numbers, which
can be found in books on difference equations.

Solution of Example A.5 Continued. By substituting the first two
of the above formulas into (A.34), we see that the desired answer is

$$\lambda^k = -(1/2)k^2 + (11/2)k - 15, \tag{A.35}$$

which is the solution needed for Example 8.7.
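
The conversions and Example A.5 can be checked with exact arithmetic. The following is a minimal sketch, assuming the terminal condition of Example A.5 reads $\lambda^6 = 0$ and the right-hand side is $-k + 5$. The function names are hypothetical, and the Stirling numbers are generated by their standard recurrence rather than taken from a table.

```python
from fractions import Fraction

def factorial_power(k, n):
    """k^(n) = k (k-1) ... (k-(n-1)), with k^(0) = 1."""
    result = 1
    for j in range(n):
        result *= (k - j)
    return result

def stirling2(n, m):
    """Stirling number of the second kind S(n, m) by the usual recurrence."""
    if n == m:
        return 1
    if m == 0 or m > n:
        return 0
    return m * stirling2(n - 1, m) + stirling2(n - 1, m - 1)

def lam(k):
    """Candidate solution of Example A.5 written in factorial powers."""
    return (Fraction(-1, 2) * factorial_power(k, 2)
            + 5 * factorial_power(k, 1) - 15)

# Coefficients of k^(1), ..., k^(4) in the expansion of k^4.
print([stirling2(4, m) for m in range(1, 5)])
# The difference equation, the terminal condition, and the ordinary-power form.
print(all(lam(k + 1) - lam(k) == -k + 5 for k in range(0, 6)))
print(lam(6))
print(all(lam(k) == Fraction(-1, 2) * k ** 2 + Fraction(11, 2) * k - 15
          for k in range(0, 7)))
```

Using `Fraction` keeps the factor $1/2$ exact, so the equalities are tested without rounding error.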

EXERCISES FOR APPENDIX A 

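A.1 If $A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$, show that
$\Lambda = \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}$ and
$P = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$.
Use (A.22) to solve (A.10) for this data, given that
$z(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$.

A.2 If $A = \begin{bmatrix} 3 & 3 \\ 2 & 4 \end{bmatrix}$, show that
$\Lambda = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix}$ and
$P = \begin{bmatrix} 1 & 3 \\ 1 & -2 \end{bmatrix}$.
Use (A.22) to solve (A.10) for this data, given that
$z(0) = \begin{bmatrix} 0 \\ 5 \end{bmatrix}$.

A.3 After you have read Section 6.1, re-solve the production-inventory
example stated in equations (6.1) and (6.2) (ignoring the control
constraint $P \ge 0$) by the method of Section A.8. The linear
two-point boundary value problem is stated in equations (6.6) and
(6.7).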




Appendix B 

Calculus of Variations and 
Optimal Control Theory 



Here we introduce the subject of the calculus of variations by analogy
with the classical topic of maximization and minimization in calculus;
see Gelfand and Fomin (1963), Young (1969), and Leitmann (1981) for
rigorous treatments of the subject. The problem of the calculus of
variations is that of determining a function that maximizes a given functional,
the objective function. An analogous problem in calculus is that of
determining a point at which a specific function, the objective function, is
maximum. This, of course, is done by taking the first derivative of the
function and equating it to zero. This is what is called the first-order
condition for a maximum. A similar procedure will be employed to
derive the first-order condition for the variational problem. The analogy
with classical optimization extends also to the second-order maximization
condition of calculus. Finally, we will show the relationship between
the maximum principle of optimal control theory and the necessary
conditions of the calculus of variations. It is noted that this relationship is
similar to the one between the Kuhn-Tucker conditions in mathematical
programming and the first-order conditions in classical optimization.

We start with the “simplest” variational problem in the next section. 



B.1 The Simplest Variational Problem

Assume a function $x \in C^1[0,T]$, where $C^1[0,T]$ is a class of functions
defined over the interval $[0,T]$ with continuous first derivatives.
(For simplicity in exposition, assume $x$ to be a scalar function. The
extension to a vector function is straightforward.) Assume further that
a function in this class is termed admissible if it satisfies the terminal
conditions

$$x(0) = x_0 \quad \text{and} \quad x(T) = x_T. \tag{B.1}$$

We are thus dealing with a fixed-end-point problem. Examples of
admissible functions for the problem are shown in Figure B.1; see Section 6 and
Chapters 2 and 3 of Gelfand and Fomin (1963) for problems other than
the simplest problem, i.e., the problems with other kinds of conditions
for the end points.

Figure B.1: Examples of Admissible Functions for the Problem

The problem under consideration is to obtain the admissible function
$x^*$ for which the functional

$$J(x) = \int_0^T g(x, \dot{x}, t)\,dt \tag{B.2}$$

has a relative maximum. We will assume that all first and second partial
derivatives of the function $g: E^1 \times E^1 \times E^1 \to E^1$ are continuous.

B.2 The Euler Equation

The necessary first-order conditions in classical optimization were
obtained by considering small changes about the solution point. For the
variational problem, we consider small variations about the solution
function. Let $x(t)$ be the solution and let

$$y(t) = x(t) + \varepsilon\eta(t),$$

where $\eta \in C^1[0,T]$ is an arbitrary continuously differentiable
function satisfying

$$\eta(0) = \eta(T) = 0, \tag{B.3}$$

and $\varepsilon > 0$ is a small number. A sketch of these functions is shown in
Figure B.2.

Figure B.2: Variation about the Solution Function

The value of the objective functional associated with $y(t)$ can be
considered a function of $\varepsilon$, i.e.,

$$V(\varepsilon) = J(y) = \int_0^T g(x + \varepsilon\eta, \dot{x} + \varepsilon\dot{\eta}, t)\,dt.$$

However, $x(t)$ is a solution and therefore $V(\varepsilon)$ must have a maximum at
$\varepsilon = 0$. This means

$$\delta J = \frac{dV}{d\varepsilon}\bigg|_{\varepsilon=0} = 0,$$

where $\delta J$ is known as the variation in $J$. Differentiating $V(\varepsilon)$ with
respect to $\varepsilon$ and setting $\varepsilon = 0$ yields

$$\delta J = \frac{dV}{d\varepsilon}\bigg|_{\varepsilon=0} = \int_0^T (g_x\eta + g_{\dot{x}}\dot{\eta})\,dt = 0,$$

which after integrating the second term by parts provides

$$\frac{dV}{d\varepsilon}\bigg|_{\varepsilon=0} = \int_0^T g_x\eta\,dt + (g_{\dot{x}}\eta)\Big|_0^T - \int_0^T \frac{d}{dt}(g_{\dot{x}})\,\eta\,dt = 0. \tag{B.4}$$

Because of the end conditions on $\eta$, the expression simplifies to

$$\frac{dV}{d\varepsilon}\bigg|_{\varepsilon=0} = \int_0^T \left[g_x - \frac{d}{dt}g_{\dot{x}}\right]\eta\,dt = 0.$$

We now use the fundamental lemma of the calculus of variations
which states that if $h$ is a continuous function and $\int_0^T h(t)\eta(t)\,dt = 0$ for
every continuous function $\eta(t)$, then $h(t) = 0$ for all $t \in [0,T]$. The reason
that this lemma holds, without going into details of a rigorous proof
which is available in Gelfand and Fomin (1963), is as follows. Suppose
that $h(t) \ne 0$ for some $t \in [0,T]$. Since $h(t)$ is continuous, there is,
therefore, an interval $(t_1, t_2) \subset [0,T]$ over which $h$ is nonzero and has the
same sign. Now selecting $\eta(t)$ such that

$$\eta(t) \text{ is } \begin{cases} > 0, & t \in (t_1, t_2), \\ 0, & \text{otherwise}, \end{cases}$$

it is possible to make the integral $\int_0^T h(t)\eta(t)\,dt \ne 0$. Thus, by
contradiction, $h(t)$ must be identically zero over the entire interval $[0,T]$.

By using the fundamental lemma, we have the necessary condition

$$g_x - \frac{d}{dt}g_{\dot{x}} = 0 \tag{B.5}$$

known as the Euler equation, which must be satisfied by a maximal
solution $x^*$.

We note that the Euler equation is a second-order ordinary differential
equation. This can be seen by taking the total time derivative of $g_{\dot{x}}$
and collecting terms:

$$g_{\dot{x}\dot{x}}\ddot{x} + g_{x\dot{x}}\dot{x} + (g_{t\dot{x}} - g_x) = 0.$$

The boundary conditions for this equation are obviously the end-point
conditions $x(0) = x_0$ and $x(T) = x_T$.
Special Case (i) When $g$ does not depend explicitly on $\dot{x}$.

In this case, the Euler equation (B.5) reduces to

$$g_x = 0,$$

which is nothing but the first-order condition of classical optimization.
In this case, the dynamic problem is a succession of static classical
optimization problems.

Special Case (ii) When $g$ does not depend explicitly on $x$.

The Euler equation reduces to

$$\frac{d}{dt}g_{\dot{x}} = 0, \tag{B.6}$$

which we can integrate as

$$g_{\dot{x}} = \text{constant}. \tag{B.7}$$

Special Case (iii) When $g$ does not depend explicitly on $t$.

Finally, we have the important special case in which $g$ is explicitly
independent of $t$. In this case, we write the Euler equation (B.5) as

$$\frac{d}{dt}(g - \dot{x}g_{\dot{x}}) - g_t = 0. \tag{B.8}$$

But $g_t = 0$ and therefore we can solve the above equation as

$$g - \dot{x}g_{\dot{x}} = C, \tag{B.9}$$

where $C$ is the constant of the integration.



B.3 The Shortest Distance Between Two Points
on the Plane

The problem is to show that the straight line passing through two points
on a plane is the shortest distance between the two points. The problem
can be stated as follows:

$$\min \int_0^T \sqrt{1 + \dot{x}^2}\,dt$$

subject to

$$x(0) = x_0 \quad \text{and} \quad x(T) = x_T.$$

Here $t$ refers to distance rather than time. Since $g = -\sqrt{1 + \dot{x}^2}$ does not
depend explicitly on $x$, we are in the second special case and the first
integral (B.7) of the Euler equation is

$$g_{\dot{x}} = -\dot{x}(1 + \dot{x}^2)^{-1/2} = C.$$

This implies that $\dot{x}$ is a constant, which results in the solution

$$x(t) = C_1 t + C_2,$$

where $C_1$ and $C_2$ are constants. These can be evaluated by imposing
boundary conditions which give $C_1 = (x_T - x_0)/T$ and $C_2 = x_0$. Thus,

$$x(t) = \frac{x_T - x_0}{T}\,t + x_0,$$

which is the straight line passing through $x_0$ and $x_T$.
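
The result can be illustrated numerically by comparing the length of the straight line with the lengths of nearby admissible curves $x + \varepsilon\eta$. The following is a minimal sketch: the end points, the choice $\eta(t) = \sin(\pi t/T)$, and the function names are hypothetical, and the integral is approximated by the length of a polyline.

```python
import math

def path_length(x, T=1.0, n=2000):
    """Approximate the integral of sqrt(1 + x'(t)^2) over [0, T] by a polyline."""
    h = T / n
    return sum(math.hypot(h, x((i + 1) * h) - x(i * h)) for i in range(n))

# Hypothetical end points x(0) = 1 and x(T) = 3 with T = 1.
x0, xT, T = 1.0, 3.0, 1.0

def line(t):
    return (xT - x0) / T * t + x0

def perturbed(eps):
    # eta(t) = sin(pi t / T) vanishes at both end points, so the curve
    # line + eps * eta is still admissible.
    return lambda t: line(t) + eps * math.sin(math.pi * t / T)

for eps in (0.0, 0.1, 0.5):
    print(eps, path_length(perturbed(eps), T))
```

The length is smallest at $\varepsilon = 0$ and grows with $\varepsilon$, as the first-order condition suggests.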



B.4 The Brachistochrone Problem 

The problem arises from the search for the shape of a wire along which 
a bead will slide in the least time from a given point to another, under 
the influence of gravity; see Figure 1.1. 

The Brachistochrone problem has a long history. It was first studied 
(incorrectly) by Galileo in 1630. The problem was correctly posed by 
Johann Bernoulli in 1696 and later solved by Johann Bernoulli, Jacob 
Bernoulli, Newton, and L’Hospital. Note that Euler deduced the Euler’s 
equation in 1744, and we will solve the Brachistochrone problem using 
Euler’s equation. But first we must formulate the problem. 

Assume the bead slides with no friction. Let m denote the mass of the
bead, s denote the arc length, t denote the horizontal axis, x denote the
vertical axis (measured vertically down), and $\tau$ denote the time. Assume
$t_0 = 0$, $x(t_0) = 0$, $T = 1$, and $x(T) = 1$.
We wish to minimize

$$J = \int_0^{s_T} \frac{ds}{v},$$

where v represents velocity, and $s_T$ is the final displacement measured
on the curve. We can write

$$ds = \sqrt{1 + \dot{x}^2}\,dt$$

and, from elementary physics, it is known that if $v(t_0 = 0) = 0$ and a
denotes the gravitational acceleration constant, then

$$v = \sqrt{2ax}.$$

Therefore, the variational problem can be stated as

$$\min\left\{J = \int_0^1 \sqrt{\frac{1 + \dot{x}^2}{2ax}}\,dt\right\},$$

where $\dot{x} = dx/dt$ (note that t does not denote time), and x(0) = 0 and
x(1) = 1. Since a is a constant, we can rewrite the problem as

$$\min\left\{J(x) = \int_0^1 g(x,\dot{x},t)\,dt = \int_0^1 \sqrt{\frac{1 + \dot{x}^2}{x}}\,dt\right\}.$$



Since g does not depend explicitly on t, the problem belongs to the third
special case. Using the first integral (B.9) of the Euler equation for this
case, we have

$$\dot{x}^2\,[x(1 + \dot{x}^2)]^{-1/2} - \left[\frac{1 + \dot{x}^2}{x}\right]^{1/2} = C_1 \quad \text{(a constant)}.$$

We can reduce this to

$$\left(\frac{dx}{dt}\right)^2 = \frac{1}{C_1^2 x} - 1.$$

To solve this equation, we separate the variables as

$$dt = \frac{\sqrt{x}\,dx}{\sqrt{1/C_1^2 - x}}$$

and substitute

$$x = (\sin^2\theta)/C_1^2 = (1 - \cos 2\theta)/(2C_1^2). \tag{B.10}$$

The resulting expression can be integrated to yield

$$t = [\theta - (1/2)\sin 2\theta]/C_1^2 + C_2. \tag{B.11}$$

The condition $\theta = 0$ at t = 0 implies $C_2 = 0$, provided $C_1^2 > 0$. The
value of $C_1$ can be obtained in terms of the value of $\theta$ at t = 1; let this
be $\theta_1$. Then, since x = 1 at t = 1, we have $C_1^2 = \sin^2\theta_1$, where $\theta_1$ satisfies

$$2\theta_1 - 1 = \sin 2\theta_1 - \cos 2\theta_1.$$

This equation must be solved numerically. An iterative numerical procedure
yields $\theta_1 = 1.206$ and therefore $C_1^2 = 0.873$. Defining $\phi = 2\theta$, we
can write (B.10) and (B.11) as

$$x = 0.573(1 - \cos\phi),$$
$$t = 0.573(\phi - \sin\phi),$$

which are equations of a cycloid in the parametric form. The shape of
the curve is shown in Figure 1.1 in Chapter 1.
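
The numerical step can be reproduced with a few lines of code. The following is only a sketch: the text does not say which iterative procedure was used, so bisection is assumed here, and the function names and the bracket [1.0, 1.5] are choices made for illustration. The bracket is deliberately kept away from zero because the equation also has the trivial root at zero.

```python
import math

def residual(theta):
    # Left side minus right side of 2*theta - 1 = sin(2*theta) - cos(2*theta).
    return 2.0 * theta - 1.0 - math.sin(2.0 * theta) + math.cos(2.0 * theta)

def solve_theta1(lo=1.0, hi=1.5, tol=1e-10):
    # Plain bisection (an assumed method); the bracket excludes the trivial root 0.
    assert residual(lo) < 0.0 < residual(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta1 = solve_theta1()
c1_squared = math.sin(theta1) ** 2      # C_1^2 = sin^2(theta_1)
radius = 1.0 / (2.0 * c1_squared)       # coefficient of the cycloid equations

# End point of the cycloid at phi = 2*theta_1; both values should be close to 1.
phi1 = 2.0 * theta1
x_end = radius * (1.0 - math.cos(phi1))
t_end = radius * (phi1 - math.sin(phi1))
print(theta1, c1_squared, radius, x_end, t_end)
```

The last two printed values check that the cycloid indeed passes through the terminal point x(1) = 1.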

B.5 The Weierstrass-Erdmann Corner 
Conditions 

So far we have only considered functionals defined for smooth curves.
This is, however, a restricted class of curves which qualify as solutions,
since it is easy to give examples of variational problems which have no
solution in this class. Consider, for example, the objective functional

$$\min\left\{J(x) = \int_{-1}^{1} x^2(1 - \dot{x})^2\,dt\right\}, \quad x(-1) = 0, \quad x(1) = 1.$$

The greatest lower bound for J(x) for smooth x = x(t) satisfying the
boundary conditions is obviously zero. Yet there is no $x \in C^1[-1,1]$
with x(-1) = 0 and x(1) = 1 which achieves this value of J(x). In fact,
the minimum is achieved for the curve

$$x(t) = \begin{cases} 0, & -1 \le t \le 0, \\ t, & 0 < t \le 1, \end{cases}$$

which has a corner (i.e., a discontinuous first derivative) at t = 0. Such
a piecewise smooth extremal with corners is called a broken extremal.

We now enlarge the class of admissible functions by relaxing the
requirement that they be smooth everywhere. The larger class is the class
of piecewise continuous functions which are continuously differentiable
almost everywhere in [0,T], i.e., except at some points in [0,T].

Let x(t) be an extremal with a corner at $\tau \in [0,T]$. Let us decompose
J(x) as

$$J(x) = \int_0^T g(x,\dot{x},t)\,dt = \int_0^\tau g(x,\dot{x},t)\,dt + \int_\tau^T g(x,\dot{x},t)\,dt = J_1(x) + J_2(x).$$

It is clear that on each of the intervals $[0,\tau)$ and $(\tau,T]$, the Euler equation
must hold.

To compute variations $\delta J_1$ and $\delta J_2$, we must recognize that the two
'pieces' of x are not fixed-end-point problems. We must require that the
two pieces of x join continuously at $t = \tau$; the point $t = \tau$ can, however,
move freely as shown in Figure B.3.




Figure B.3: A Broken Extremal with Corner at $\tau$

This will require a slightly modified version of formula (B.4) for writing
out the variations; see pp. 55-56 in Gelfand and Fomin (1963). Equating
the sum of variations

$$\delta J = \delta J_1 + \delta J_2 = 0$$

for x(t) to be an extremal and using the fact that x(t) must be continuous
at $t = \tau$ implies

$$g_{\dot{x}}\big|_{\tau^-} = g_{\dot{x}}\big|_{\tau^+}, \tag{B.12}$$

$$[g - \dot{x}g_{\dot{x}}]_{\tau^-} = [g - \dot{x}g_{\dot{x}}]_{\tau^+}. \tag{B.13}$$

These conditions are called Weierstrass-Erdmann corner conditions,
which must hold at the point $\tau$ where the extremal has a corner.

In each of the intervals $[0,\tau)$ and $(\tau,T]$, the extremal x must satisfy the
Euler equation (B.5). Solving these two equations will provide us with
four constants of integration since the Euler equations are second-order
differential equations. These constants can be found from the end-point
conditions (B.1) and Weierstrass-Erdmann conditions (B.12) and (B.13).
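
As a worked check, the corner conditions can be verified on the broken extremal of the example with integrand $x^2(1 - \dot{x})^2$ given earlier in this section. The computation below is a sketch added for illustration; it uses only that integrand and the two pieces x = 0 and x = t, with the corner at t = 0.

```latex
% Integrand and its derivative with respect to \dot{x}:
g = x^2 (1 - \dot{x})^2, \qquad g_{\dot{x}} = -2 x^2 (1 - \dot{x}).
% Combination appearing in (B.13):
g - \dot{x}\, g_{\dot{x}} = x^2 (1 - \dot{x})^2 + 2 x^2 \dot{x} (1 - \dot{x})
                          = x^2 (1 - \dot{x}^2).
% At the corner t = 0 we have x = 0, with \dot{x} = 0 on the left and \dot{x} = 1 on the right, so
g_{\dot{x}}\big|_{0^-} = 0 = g_{\dot{x}}\big|_{0^+}, \qquad
[g - \dot{x} g_{\dot{x}}]_{0^-} = 0 = [g - \dot{x} g_{\dot{x}}]_{0^+}.
% Each piece also satisfies the Euler equation g_x - \frac{d}{dt} g_{\dot{x}} = 0:
% for x \equiv 0 both terms vanish; for x = t we have 1 - \dot{x} = 0, so both terms vanish again.
```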



B.6 Legendre’s Conditions: The Second 
Variation 



The Euler equation is a necessary condition analogous to the first-order
condition for a maximum (or minimum) in the classical optimization
problems of calculus. The condition analogous to the second-order
necessary condition for a maximum is the Legendre condition

$$g_{\dot{x}\dot{x}} \le 0. \tag{B.14}$$

To obtain this condition, we use the second-order condition of classical
optimization for the function $V(\varepsilon)$ to be a maximum at $\varepsilon = 0$, i.e.,

$$\frac{d^2V(\varepsilon)}{d\varepsilon^2}\bigg|_{\varepsilon=0} = \int_0^T (g_{xx}\eta^2 + 2g_{x\dot{x}}\eta\dot{\eta} + g_{\dot{x}\dot{x}}\dot{\eta}^2)\,dt \le 0. \tag{B.15}$$

Integrating the middle term by parts and using (B.3), we can transform
(B.15) into a more convenient form

$$\int_0^T (Q\eta^2 + P\dot{\eta}^2)\,dt \le 0, \tag{B.16}$$

where

$$Q = Q(t) = g_{xx} - \frac{d}{dt}g_{x\dot{x}} \quad \text{and} \quad P = P(t) = g_{\dot{x}\dot{x}}.$$

While it is possible to rigorously obtain (B.14) from (B.16), we will
only provide a qualitative argument for this. If we consider the quadratic
functional (B.16) for functions $\eta(t)$ satisfying $\eta(0) = 0$, then $\eta(t)$ will be
small in [0,T] if $\dot{\eta}(t)$ is small in [0,T]. The converse is not true, however,
since it is easy to construct $\eta(t)$ which is small but has a large derivative
$\dot{\eta}(t)$ in [0,T]. Thus, $P\dot{\eta}^2$ plays the dominant role in (B.16); i.e., $P\dot{\eta}^2$
can be much larger than $Q\eta^2$ but it cannot be much smaller (provided
$P \ne 0$). Therefore, it might be expected that the sign of the functional
in (B.16) is determined by the sign of the coefficient P(t), i.e., (B.16)
implies (B.14). For a rigorous proof, see Gelfand and Fomin (1963).

We note that the strengthened Legendre condition (i.e., with a strict
inequality in (B.14)), the Euler equation, and one other condition called
the strengthened Jacobi condition are sufficient for a maximum. The reader
can consult Chapter 5 of Gelfand and Fomin (1963) for details.



B.7 Necessary Condition for a Strong 
Maximum 

So far we have discussed necessary conditions for a weak maximum. By
weak maximum we mean that the candidate extremals are smooth or
piecewise smooth functions. The concept of a strong maximum, on the
other hand, requires that the candidate extremals need only be continuous
functions. Without going into details, which are available in Gelfand and
Fomin (1963), we state a necessary condition for a strong maximum. This
is called the Weierstrass necessary condition. The condition is analogous
to the one in the static case that the objective function be concave. It
states that if the functional (B.2) has a strong maximum for the extremal
$\gamma$ satisfying (B.1), then

$$E(x,\dot{x},t,v) \le 0 \tag{B.17}$$

along $\gamma$ for every finite v, where E is the Weierstrass Excess Function
defined as

$$E(x,\dot{x},t,v) = g(x,v,t) - g(x,\dot{x},t) - g_{\dot{x}}(x,\dot{x},t)(v - \dot{x}). \tag{B.18}$$

Note that this condition is always met if $g(x,\dot{x},t)$ is concave in $\dot{x}$.
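
The remark about concavity can be illustrated numerically. The sketch below evaluates the excess function (B.18) for the concave integrand of the shortest-distance problem written in maximization form, which depends on the slope only. The helper names and the grid of trial slopes are assumptions made for this illustration.

```python
import math

def g(xdot):
    # Shortest-distance integrand in maximization form; it depends only on the slope.
    return -math.sqrt(1.0 + xdot * xdot)

def g_xdot(xdot):
    # Partial derivative of g with respect to the slope.
    return -xdot / math.sqrt(1.0 + xdot * xdot)

def excess(xdot, v):
    # Weierstrass excess function (B.18) with the x and t arguments suppressed.
    return g(v) - g(xdot) - g_xdot(xdot) * (v - xdot)

slopes = [-3.0, -1.0, 0.0, 0.5, 2.0]            # slopes of the candidate extremal
trial = [k / 4.0 for k in range(-40, 41)]        # trial slopes v in [-10, 10]
worst = max(excess(s, v) for s in slopes for v in trial)
print(worst)   # the largest excess found; it should not be positive
```

Because g is concave in the slope, its graph lies below every tangent line, which is exactly the statement that the excess function is nonpositive.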

The proof of (B.17) is by contradiction. Suppose there exists a $\tau \in [0,T]$
and a vector q such that

$$E(x(\tau),\dot{x}(\tau),\tau,q) > 0,$$

where x = x(t) is the equation of the extremal $\gamma$. It is then possible to
suitably modify $\gamma$ to $\beta$ which is close to $\gamma$ in $C^0[0,T]$ such that

$$\Delta J = \int_\beta g(x,\dot{x},t)\,dt - \int_\gamma g(x,\dot{x},t)\,dt > 0,$$

contradicting the hypothesis that J(x) has a strong maximum for $\gamma$.



B.8 Relation to the Optimal Control Theory 



It is possible to derive the necessary conditions of the calculus of
variations from the maximum principle. This is strongly reminiscent of the
relationship between the first-order conditions of classical optimization
and the Kuhn-Tucker conditions of mathematical programming.

First, we note that the calculus of variations problem can be stated
as an optimal control problem as follows:

$$\max\left\{J = \int_0^T g(x,u,t)\,dt\right\}$$

subject to

$$\dot{x} = u, \quad x(0) = x_0, \quad x(T) = x_T.$$

The Hamiltonian is

$$H(x,u,\lambda,t) = g(x,u,t) + \lambda u \tag{B.19}$$

with the adjoint variable $\lambda$ satisfying

$$\dot{\lambda} = -H_x = -g_x. \tag{B.20}$$

Maximizing the Hamiltonian with respect to u yields

$$H_u = g_{\dot{x}} + \lambda = 0 \;\Rightarrow\; \lambda = -g_{\dot{x}}. \tag{B.21}$$

Differentiating with respect to time, we have

$$\dot{\lambda} = -\frac{d}{dt}g_{\dot{x}}.$$

This equation with (B.20) implies

$$g_x - \frac{d}{dt}g_{\dot{x}} = 0,$$

which is the Euler equation of the calculus of variations.
The second-order condition for the maximization of the Hamiltonian,
i.e.,

$$H_{uu} \le 0 \iff g_{\dot{x}\dot{x}} \le 0,$$

is the Legendre condition.

Again, by the maximum principle, if u is an optimal control, then

$$H(x,u,\lambda,t) \ge H(x,v,\lambda,t),$$

where v is any other control. By the definition of the Hamiltonian (B.19)
and equation (B.21), we have

$$g(x,\dot{x},t) - g_{\dot{x}}\dot{x} \ge g(x,v,t) - g_{\dot{x}}v,$$

which by transposition of terms yields the Weierstrass necessary condition

$$E(x,\dot{x},t,v) = g(x,v,t) - g(x,\dot{x},t) - g_{\dot{x}}(v - \dot{x}) \le 0.$$

We have just proved the equivalence of the maximum principle and
the Weierstrass necessary condition in the case where $\Omega$ is open. In cases
when $\Omega$ is closed and when the optimal control is on the boundary of $\Omega$,
the Weierstrass necessary condition is no longer valid, in general. The
maximum principle still applies, however.

Finally, according to the maximum principle, both $\lambda$ and H are
continuous functions of time. However,

$$\lambda = -g_{\dot{x}} \quad \text{and} \quad H = g - g_{\dot{x}}\dot{x},$$

which means that the right-hand sides must be continuous with respect to
time, i.e., even across corners. These are precisely the Weierstrass-Erdmann
corner conditions.




Appendix C 



An Alternative Derivation 
of the Maximum Principle 

Recall that in the derivation of the maximum principle in Chapter 2, we
assumed the twice differentiability of the return function V. Looking at
(2.32), we can observe that the smoothness assumptions on the return
function do not arise in the statement of the maximum principle. Also,
since it is not an exogenously given function, there is no a priori reason to
assume the twice differentiability. In many important cases, as a matter
of fact, V has no derivatives at individual points, e.g., at points on
switching manifolds.

In what follows, we will give an alternate derivation. This proof follows
the course pointed out by Pontryagin et al. (1962) but with certain
simplifications. It appears in Fel'dbaum (1965) and, in our opinion, it
is one of the simplest proofs for the maximum principle which is not
related to dynamic programming and thus permits the elimination of
assumptions about the differentiability of the return function V(t,x).

We select the Mayer form of the problem (2.5) for deriving the maximum
principle in this section. It will be convenient to reproduce (2.5)
here as (C.1):

$$\max_{u(t) \in \Omega(t)} \{J = cx(T)\}$$

subject to

$$\dot{x} = f(x,u,t), \quad x(0) = x_0. \tag{C.1}$$

C.1 Needle-Shaped Variation

Let u*(t) be an optimal control with corresponding state trajectory x*(t).
We sketch u*(t) in Figure C.1 and x*(t) in Figure C.2 in a scalar case.
Note that the kink in x*(t) at $t = \theta$ corresponds to the discontinuity in
u*(t) at $t = \theta$.



Figure C.1: Needle-Shaped Variation



Let $\tau$ denote any time in the open interval (0,T). We select a sufficiently
small $\varepsilon$ to ensure that $\tau - \varepsilon > 0$ and concentrate our attention on
this small interval $(\tau - \varepsilon, \tau]$. We vary the control on this interval while
keeping the control on the remaining intervals $[0, \tau - \varepsilon]$ and $(\tau, T]$ fixed.

Specifically, the modified control is

$$u(t) = \begin{cases} v \in \Omega, & t \in (\tau - \varepsilon, \tau], \\ u^*(t), & \text{otherwise.} \end{cases} \tag{C.2}$$

Figure C.2: Trajectories x*(t) and x(t) in a One-Dimensional Case

This is called a needle-shaped variation as shown in Figure C.1. It
is a jump function and is different from variations in the calculus of
variations (see Appendix B). Also, the difference v - u* is finite and need
not be small. However, since the variation is on a small time interval, its
influence on the subsequent state trajectory can be proved to be 'small'.
This is done in the following.

Let the subsequent motion be denoted by $x(t) \ne x^*(t)$ for $t > \tau - \varepsilon$.
In Figure C.2, we have sketched x(t) corresponding to u(t).

Let

$$\delta x(t) = x(t) - x^*(t), \quad t \ge \tau - \varepsilon,$$

denote the change in the state variables. Obviously $\delta x(\tau - \varepsilon) = 0$.
Clearly,

$$\delta x(\tau) \approx \varepsilon[\dot{x}(s) - \dot{x}^*(s)], \tag{C.3}$$

where s denotes some intermediate time in the interval $(\tau - \varepsilon, \tau]$. In
particular, we can write (C.3) as

$$\delta x(\tau) = \varepsilon[\dot{x}(\tau) - \dot{x}^*(\tau)] + o(\varepsilon)
= \varepsilon[f(x(\tau),v,\tau) - f(x^*(\tau),u^*(\tau),\tau)] + o(\varepsilon). \tag{C.4}$$

But $\delta x(\tau)$ is small since f is assumed to be bounded. Furthermore, since
f is continuous and the difference $\delta x(\tau) = x(\tau) - x^*(\tau)$ is small, we can
rewrite (C.4) as

$$\delta x(\tau) \approx \varepsilon[f(x^*(\tau),v,\tau) - f(x^*(\tau),u^*(\tau),\tau)]. \tag{C.5}$$

Since the initial difference $\delta x(\tau)$ is small and since the control u*(t) is
not changed for $t > \tau$, we may conclude that $\delta x(t)$ will be small for all $t > \tau$.
Being small, the law of variation of $\delta x(t)$ can be found from linear
equations for small changes in the state variables. These are called variational
equations. From the state equation in (C.1), we have

$$\frac{d(x^* + \delta x)}{dt} = f(x^* + \delta x, u^*, t) \tag{C.6}$$

or

$$\frac{dx^*}{dt} + \frac{d(\delta x)}{dt} \approx f(x^*,u^*,t) + f_x\,\delta x \tag{C.7}$$

or using (C.1),

$$\frac{d(\delta x)}{dt} \approx f_x(x^*,u^*,t)\,\delta x, \quad \text{for } t \ge \tau, \tag{C.8}$$

with the initial condition $\delta x(\tau)$ given by (C.5).

The basic idea in deriving the maximum principle is that equations
(C.8) are linear variational equations and result in an extraordinary
simplification. We next obtain the adjoint equations.
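
The size of the perturbation caused by a needle-shaped variation can be checked on a toy problem. The sketch below is only an illustration under stated assumptions: the scalar dynamics dx/dt = -x + u, the nominal control equal to zero, the needle value v = 1, and the forward-Euler discretization are all choices made here and do not come from the text. It compares the simulated change at the end of the needle with the estimate (C.5), and the terminal change with the solution of the linear equation (C.8), which for these dynamics is an exponential decay.

```python
import math

def f(x, u):
    # Illustrative scalar dynamics (an assumption): dx/dt = -x + u, so f_x = -1.
    return -x + u

def simulate(n, k_tau, k_eps, v, x0=1.0, T=1.0):
    # Forward-Euler path; the control is v on steps [k_tau - k_eps, k_tau) and 0 elsewhere.
    dt = T / n
    x = x0
    path = [x]
    for k in range(n):
        u = v if k_tau - k_eps <= k < k_tau else 0.0
        x += dt * f(x, u)
        path.append(x)
    return path

n, k_tau, k_eps = 20000, 10000, 200       # tau = 0.5 and eps = 0.01 when T = 1
eps = k_eps / n
x_star = simulate(n, k_tau, 0, v=1.0)     # k_eps = 0 leaves the nominal control unchanged
x_pert = simulate(n, k_tau, k_eps, v=1.0)

dx_tau = x_pert[k_tau] - x_star[k_tau]    # simulated change at time tau
dx_T = x_pert[n] - x_star[n]              # simulated change at time T

# Estimate (C.5): eps * [f(x*, v) - f(x*, u*)] evaluated at time tau.
predicted_dx_tau = eps * (f(x_star[k_tau], 1.0) - f(x_star[k_tau], 0.0))
# Solution of (C.8) with f_x = -1: the change decays like exp(-(T - tau)).
predicted_dx_T = dx_tau * math.exp(-(1.0 - 0.5))
print(dx_tau, predicted_dx_tau, dx_T, predicted_dx_T)
```

The gap between the simulated and predicted change at the end of the needle is of second order in the needle width, which is the sense in which the influence of the variation is 'small'.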



C.2 Derivation of the Adjoint Equation and the 
Maximum Principle 

For this derivation, we employ two methods. The direct method, similar 
to that of Hartberger (1973), is the consequence of directly integrating 
(C.8). The indirect method avoids this integration by a trick which is 
instructive. 

Direct method. Integrating (C.8) we get

$$\delta x(T) = \delta x(\tau) + \int_\tau^T f_x[x^*(t),u^*(t),t]\,\delta x(t)\,dt, \tag{C.9}$$

where the initial condition $\delta x(\tau)$ is given in (C.5).

Since $\delta x(T)$ is the change in the terminal state from the optimal state
x*(T), the change in the objective function $\delta J$ must be nonpositive. Thus,

$$\delta J = c\,\delta x(T) = c\,\delta x(\tau) + \int_\tau^T cf_x[x^*(t),u^*(t),t]\,\delta x(t)\,dt \le 0. \tag{C.10}$$

Furthermore, since (C.8) is a linear homogeneous differential equation,
we can write its general solution as

$$\delta x(t) = \Phi(t,\tau)\,\delta x(\tau), \tag{C.11}$$

where the fundamental solution matrix or the transition matrix
$\Phi(t,\tau) \in E^{n \times n}$ obeys

$$\frac{d}{dt}\Phi(t,\tau) = f_x[x^*(t),u^*(t),t]\,\Phi(t,\tau), \quad \Phi(\tau,\tau) = I, \tag{C.12}$$

where I is an n x n identity matrix; see Appendix A.
Substituting for $\delta x(t)$ from (C.11) into (C.10), we have

$$\delta J = c\,\delta x(\tau) + \int_\tau^T cf_x[x^*(t),u^*(t),t]\,\Phi(t,\tau)\,\delta x(\tau)\,dt \le 0. \tag{C.13}$$

This induces the definition

$$\lambda(\tau) = \int_\tau^T cf_x[x^*(t),u^*(t),t]\,\Phi(t,\tau)\,dt + c, \tag{C.14}$$

which, when substituted into (C.13), yields

$$\delta J = \lambda(\tau)\,\delta x(\tau) \le 0. \tag{C.15}$$

But $\delta x(\tau)$ is supplied in (C.5). Noting that $\varepsilon > 0$, we can rewrite (C.15)
as

$$\lambda(\tau)f[x^*(\tau),v,\tau] - \lambda(\tau)f[x^*(\tau),u^*(\tau),\tau] \le 0. \tag{C.16}$$

Defining the Hamiltonian for the Mayer form as

$$H[x,u,\lambda,t] = \lambda f(x,u,t), \tag{C.17}$$

we can rewrite (C.16) as

$$H[x^*(\tau),u^*(\tau),\lambda(\tau),\tau] \ge H[x^*(\tau),v,\lambda(\tau),\tau]. \tag{C.18}$$

Since this can be done for almost every $\tau$, we have the required
Hamiltonian maximizing condition.

The differential equation form of the adjoint equation (C.14) can be
obtained by taking its derivative with respect to $\tau$. Thus,

$$\frac{d\lambda(\tau)}{d\tau} = \int_\tau^T cf_x[x^*(t),u^*(t),t]\,\frac{\partial\Phi(t,\tau)}{\partial\tau}\,dt - cf_x[x^*(\tau),u^*(\tau),\tau]. \tag{C.19}$$

It is also known that the transition matrix has the property

$$\frac{\partial\Phi(t,\tau)}{\partial\tau} = -\Phi(t,\tau)f_x[x^*(\tau),u^*(\tau),\tau],$$

which can be used in (C.19) to obtain

$$\frac{d\lambda(\tau)}{d\tau} = -\int_\tau^T cf_x[x^*(t),u^*(t),t]\,\Phi(t,\tau)f_x[x^*(\tau),u^*(\tau),\tau]\,dt - cf_x[x^*(\tau),u^*(\tau),\tau]. \tag{C.20}$$

Using the definition (C.14) of $\lambda(\tau)$ in (C.20), we have

$$\frac{d\lambda(\tau)}{d\tau} = -\lambda(\tau)f_x[x^*(\tau),u^*(\tau),\tau]$$

with $\lambda(T) = c$, or using (C.17) and noting that $\tau$ is arbitrary, we have

$$\dot{\lambda} = -\lambda f_x[x^*,u^*,t] = -H_x, \quad \lambda(T) = c. \tag{C.21}$$

This completes the derivation of the maximum principle along with the
adjoint equation using the direct method.

Indirect method. The indirect method employs a trick which simplifies
the derivation considerably. Instead of integrating (C.8) explicitly, we
now assume that the result of this integration yields $\delta x(T)$ as the change
in the state at the terminal time. As in (C.10), we have

$$\delta J = c\,\delta x(T) \le 0. \tag{C.22}$$

First, we define

$$\lambda(T) = c, \tag{C.23}$$

which makes it possible to write (C.22) as

$$\delta J = c\,\delta x(T) = \lambda(T)\,\delta x(T) \le 0. \tag{C.24}$$

Note parenthetically that if the objective function is J = S(x(T)), we must
define $\lambda(T) = \partial S[x(T)]/\partial x(T)$, giving us

$$\delta J = \frac{\partial S}{\partial x(T)}\,\delta x(T) = \lambda(T)\,\delta x(T).$$

Now, $\lambda(T)\delta x(T)$ is the change in the objective function due to a
change $\delta x(T)$ at the terminal time T. That is, $\lambda(T)$ is the marginal
return or the marginal change in the objective function per unit change
in the state at time T. But $\delta x(T)$ cannot be known without integrating
(C.8). We do know, however, the value of the change $\delta x(\tau)$ at time $\tau$
which caused the terminal change $\delta x(T)$ via (C.8).

We would therefore like to pose the problem of obtaining the change
$\delta J$ in the objective function in terms of the known value $\delta x(\tau)$; see
Fel'dbaum (1965). Simply stated, we would like to obtain the marginal
return $\lambda(\tau)$ per unit change in state at time $\tau$. Thus,

$$\lambda(\tau)\,\delta x(\tau) = \delta J = \lambda(T)\,\delta x(T) \le 0. \tag{C.25}$$

Obviously, knowing $\lambda(\tau)$ will make it possible to make an inference about
$\delta J$ which is directly related to the needle-shaped variation applied in the
small interval $(\tau - \varepsilon, \tau]$.

However, since $\tau$ is arbitrary, our problem of finding $\lambda(\tau)$ can be
translated to one of finding $\lambda(t)$, $t \in [0,T]$, such that

$$\lambda(t)\,\delta x(t) = \lambda(T)\,\delta x(T), \quad t \in [0,T], \tag{C.26}$$

or in other words,

$$\lambda(t)\,\delta x(t) = \text{constant}, \quad \lambda(T) = c. \tag{C.27}$$

It turns out that the differential equation which $\lambda(t)$ must satisfy can
be easily found. From (C.27),

$$\frac{d}{dt}[\lambda(t)\,\delta x(t)] = \lambda\frac{d\,\delta x}{dt} + \dot{\lambda}\,\delta x = 0, \tag{C.28}$$

which after substituting for $d\,\delta x/dt$ from (C.8) becomes

$$\lambda f_x\,\delta x + \dot{\lambda}\,\delta x = (\lambda f_x + \dot{\lambda})\,\delta x = 0. \tag{C.29}$$

Since (C.29) is true for arbitrary $\delta x$, we have

$$\dot{\lambda} = -\lambda f_x = -H_x, \tag{C.30}$$

using the definition (C.17) for the Hamiltonian.

The Hamiltonian maximizing condition can be obtained by substituting
for $\delta x(\tau)$ from (C.5) into (C.25). This is the same as what we did
in (C.15) through (C.18).
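
The invariance (C.27) behind the indirect method can be checked numerically. The following sketch rests on assumptions made only for illustration: scalar dynamics dx/dt = -x^2 with a zero nominal control, initial state 1 (so that the nominal trajectory is known in closed form), c = 1, and simple Euler steps. It integrates the variational equation (C.8) forward from the needle time and the adjoint equation (C.30) backward from the terminal time, and then inspects the product of the two along the path.

```python
def x_star(t, x0=1.0):
    # Closed-form solution of dx/dt = -x**2, x(0) = x0 (illustrative dynamics).
    return x0 / (1.0 + x0 * t)

def f_x(t):
    # Partial derivative of f = -x**2 with respect to x along the nominal path.
    return -2.0 * x_star(t)

def adjoint_products(tau=0.3, T=1.0, c=1.0, dx_tau=1e-2, n=20000):
    dt = (T - tau) / n
    ts = [tau + k * dt for k in range(n + 1)]
    # Forward Euler for the variational equation (C.8), starting at time tau.
    dx = [dx_tau]
    for k in range(n):
        dx.append(dx[-1] + dt * f_x(ts[k]) * dx[-1])
    # Backward Euler-type sweep for the adjoint equation (C.30) with lam(T) = c.
    lam = [0.0] * (n + 1)
    lam[n] = c
    for k in range(n, 0, -1):
        lam[k - 1] = lam[k] + dt * lam[k] * f_x(ts[k])
    return [l * d for l, d in zip(lam, dx)]

products = adjoint_products()
spread = max(products) - min(products)
print(products[0], products[-1], spread)
```

Up to discretization error, the product stays constant along the path, so its value at the needle time equals the terminal change in the objective, as (C.25) requires.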

The purpose of the alternative proof was to demonstrate the validity
of the maximum principle for a simple problem without knowledge
of any return function. For more complex problems, one needs complicated
mathematical analysis to rigorously prove the maximum principle
without making use of return functions. A part of mathematical rigor is
in proving the existence of an optimal solution without which necessary
conditions are meaningless; see Young (1969).




Appendix D 

Special Topics in Optimal 
Control 



In this appendix we will discuss three specialized topics. These are
linear-quadratic problems, second-order variations, and singular control. These
topics are referred to but not discussed in the main body of the text
because of their advanced nature. While we shall not be able to go into
great detail, we will provide an adequate description of these topics
and list relevant references.



D.1 Linear-Quadratic Problems

An important problem in systems theory, especially engineering sciences, 
is to synthesize feedback controllers. These controllers provide optimal 
control as a function of the state of the system. A usual method of ob- 
taining these controllers is to solve the Hamilton-Jacobi-Bellman partial 
differential equation (2.19). This equation is nonlinear in general, which 
makes it very difficult to solve in closed form. Thus, it is not possible in 
most cases to obtain optimal feedback control schemes explicitly. 

It is, however, feasible in many cases to obtain perturbation feedback 
control, which refers to control in the vicinity of an optimal path; see 
Bryson and Ho (1969). These perturbation schemes require the approx- 
imation of the problem by a linear-quadratic problem in the vicinity of 
an optimal path (see Section D.2), and feedback control for the approx- 
imating problem is easy to obtain. 

A linear-quadratic control problem is a problem with linear dynamics
and quadratic objective function. More specifically, it is:

$$\max\left\{J = \frac{1}{2}x^TGx + \frac{1}{2}\int_0^T (x^TCx + u^TDu)\,dt\right\} \tag{D.1}$$

subject to

$$\dot{x} = Ax + Bu, \quad x(0) = x_0, \tag{D.2}$$

where the first term in (D.1) is evaluated at the terminal time T.
The matrices G, C, D, A, and B are in general time dependent.
Furthermore, the matrices G, C, and D are assumed to be negative definite
and the superscript $^T$ denotes the transpose operation. Note that this
problem is a special case of Row (c) of Table 3.3.

To solve this problem for the explicit feedback controller, we write
the Hamilton-Jacobi-Bellman equation (2.19) as

$$0 = \max_u[H + V_t] = \max_u\left[\frac{1}{2}(x^TCx + u^TDu) + V_x[Ax + Bu] + V_t\right] \tag{D.3}$$

with the terminal boundary condition

$$V(x,T) = \frac{1}{2}x^TGx. \tag{D.4}$$

The maximization of the maximand in (D.3) can be carried out by taking
its derivative with respect to u and setting it to zero. Thus,

$$\frac{\partial[H + V_t]}{\partial u} = \frac{\partial H}{\partial u} = (Du)^T + V_xB = 0 \;\Rightarrow\; u^* = -D^{-1}B^TV_x^T. \tag{D.5}$$

Note that (D.5) is the same as the Hamiltonian maximizing condition.
Substituting (D.5) in (D.3) and simplifying, we obtain

$$0 = V_t + \frac{1}{2}x^TCx + V_xAx - \frac{1}{2}V_xBD^{-1}B^TV_x^T. \tag{D.6}$$

This is a nonlinear partial differential equation of first order and it
has a solution of the form

$$V(x,t) = \frac{1}{2}x^TS(t)x. \tag{D.7}$$

Substitution of (D.7) into (D.6) yields

$$0 = \frac{1}{2}x^T[\dot{S} + SA + A^TS - SBD^{-1}B^TS + C]x. \tag{D.8}$$

Since (D.8) must hold for all x, it implies the following matrix differential
equation

$$\dot{S} + SA + A^TS - SBD^{-1}B^TS + C = 0, \tag{D.9}$$

called a matrix Riccati equation, with the boundary condition

$$S(T) = G. \tag{D.10}$$

A solution procedure for Riccati equations appears in Bryson and Ho
(1969). Once we have the solution S(t) of (D.9) and (D.10), the optimal
feedback control can be written as

$$u^*(t) = -D(t)^{-1}B^T(t)S(t)x(t). \tag{D.11}$$



A generalization of (D.1), which will be useful in the next section
on the second variation, is to set

$$J = \frac{1}{2}x^TGx + \frac{1}{2}\int_0^T (x^T, u^T)\begin{pmatrix} C & N \\ N^T & D \end{pmatrix}\begin{pmatrix} x \\ u \end{pmatrix}dt. \tag{D.12}$$

The state equation is given by (D.2). It is possible to derive the optimal
control for this problem as

$$u^*(t) = -D(t)^{-1}[N^T(t) + B^T(t)S(t)]x(t), \tag{D.13}$$

where

$$\dot{S} + SA + A^TS - (SB + N)D^{-1}(B^TS + N^T) + C = 0, \quad S(T) = G. \tag{D.14}$$

For other variations and extensions of the linear-quadratic problem
(D.1) and (D.2), for which explicit feedback controllers can be developed,
the reader is referred to Bryson and Ho (1969).
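
To make the Riccati equation (D.9)-(D.10) and the feedback law (D.11) concrete, the sketch below solves a scalar instance numerically. Everything specific in it is an assumption chosen for illustration: the constant coefficients A = 0, B = 1, C = -1, D = -1, G = 0, the horizon T = 1, and the classical fourth-order Runge-Kutta scheme stepping backward from the terminal time. For these coefficients the scalar Riccati equation reduces to dS/dt = 1 - S^2 with S(T) = 0, whose solution is S(t) = -tanh(T - t), so the numerical result can be compared with a closed form.

```python
import math

def riccati_rhs(S, a, b, c, d):
    # Scalar form of (D.9) solved for dS/dt: dS/dt = -(2*a*S - S*S*b*b/d + c).
    return -(2.0 * a * S - S * S * b * b / d + c)

def solve_riccati(a, b, c, d, g, T, n=1000):
    # Classical RK4 stepping backward in time from S(T) = g down to t = 0.
    h = T / n
    S = g
    for _ in range(n):
        k1 = riccati_rhs(S, a, b, c, d)
        k2 = riccati_rhs(S - 0.5 * h * k1, a, b, c, d)
        k3 = riccati_rhs(S - 0.5 * h * k2, a, b, c, d)
        k4 = riccati_rhs(S - h * k3, a, b, c, d)
        S -= h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return S

a, b, c, d, g, T = 0.0, 1.0, -1.0, -1.0, 0.0, 1.0   # illustrative scalar data
S0 = solve_riccati(a, b, c, d, g, T)
gain0 = -b * S0 / d          # scalar form of (D.11): u*(0) = gain0 * x(0)
print(S0, -math.tanh(T), gain0)
```

With these coefficients the feedback gain at time zero is negative, so the control pushes the state toward the origin, as expected when deviations of the state and control effort are both penalized.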

D.1.1 Certainty Equivalence or Separation Principle 

Suppose equation (D.2) is changed by the presence of a Gaussian white
noise w(t) and becomes

$$\dot{x} = Ax + Bu + w,$$

where

$$E[w(t)] = 0, \quad E[w(t)w(\tau)^T] = Q(t)\delta(t - \tau),$$

and x(0) is a normal random variable with

$$E[x(0)] = 0, \quad E[x(0)x(0)^T] = P_0.$$

Because of the presence of uncertainty in the system equation, we must
modify the objective function in (D.12) as follows:

$$\max\left\{J = E\left[\frac{1}{2}x^TGx + \frac{1}{2}\int_0^T (x^T, u^T)\begin{pmatrix} C & N \\ N^T & D \end{pmatrix}\begin{pmatrix} x \\ u \end{pmatrix}dt\right]\right\}.$$

Assume further that x cannot be directly measured and the measurement
process is given by (13.21), i.e.,

$$y(t) = H(t)x(t) + v(t),$$

where v{t) is a white noise as defined in (12.72). 

The optimal control u*(t) for this linear-quadratic stochastic optimal
control problem can be shown to be given by (D.13) with x(t) replaced
by its estimate $\hat{x}(t)$; see Bryson and Ho (1969). Thus,

$$u^*(t) = -D(t)^{-1}[N^T(t) + B^T(t)S(t)]\hat{x}(t),$$

where S is given in (D.14) and $\hat{x}$ is given by the Kalman-Bucy filter:

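$$\dot{\hat{x}} = A\hat{x} + Bu^* + K[y - H\hat{x}], \quad \hat{x}(0) = 0,$$

$$K = PH^TR^{-1},$$

$$\dot{P} = AP + PA^T - KHP + Q, \quad P(0) = P_0,$$

where R(t) denotes the covariance parameter of the measurement noise v(t).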



The above procedure has received two different names in the literature.
In economics it is called the certainty equivalence principle; see
Simon (1956). In engineering and mathematics literature it is called the
separation principle; see Joseph and Tou (1961). When we call it the
certainty equivalence principle, we are emphasizing the fact that $\hat{x}(t)$ can be
used for the purposes of optimal feedback control as if it were the certain
value of the state variable x(t). The term separation principle, in contrast,
emphasizes the fact that the process of determining the optimal control
can be broken down into two steps: first, estimate x by using the optimal
filter; second, use that estimate in the optimal feedback control formula
for the deterministic problem.

D.2 Second-Order Variations

Second-order variations in optimal control theory are analogous to the
second-order conditions in the classical optimization problem of calculus.
To discuss the second-order variational condition is difficult when the
control variable u is constrained to be in the control set $\Omega$. So we make
the simplifying assumption that $\Omega = E^m$, and thus the control u is
unconstrained. As a result, we are now dealing with the problem:

$$\max\left\{J = \int_0^T F(x,u,t)\,dt + \Phi[x(T)]\right\} \tag{D.15}$$

subject to

$$\dot{x} = f(x,u,t), \quad x(0) = x_0. \tag{D.16}$$

From Chapter 2, we know that the first-order necessary conditions
for this problem are given by

$$\dot{\lambda} = -H_x, \quad \lambda(T) = \Phi_x[x(T)], \tag{D.17}$$

$$H_u = 0, \tag{D.18}$$

where the Hamiltonian H is given by

$$H = F + \lambda f. \tag{D.19}$$

Since u is unconstrained, these conditions may be easily derived by the
method of calculus of variations. To see this, we write the augmented
objective functional as

$$J = \Phi[x(T)] + \int_0^T [H(x,u,\lambda,t) - \lambda\dot{x}]\,dt. \tag{D.20}$$

Consider small perturbations from the extremal path given by (D.16)-(D.19)
as a result of small perturbations $\delta x(0)$ in the initial state. Define
the resulting perturbations in state, adjoint, and control variables by
$\delta x(t)$, $\delta\lambda(t)$, and $\delta u(t)$, respectively. These, of course, will be obtained
by linearizing (D.16)-(D.18) around the extremal path:

$$\frac{d\,\delta x}{dt} = f_x\,\delta x + f_u\,\delta u, \quad \delta x(0) \text{ specified}, \tag{D.21}$$

$$\frac{d\,\delta\lambda}{dt} = -(H_{xx}\,\delta x)^T - \delta\lambda\,f_x - (H_{xu}\,\delta u)^T, \tag{D.22}$$

$$\delta H_u = (H_{ux}\,\delta x)^T + \delta\lambda\,(H_{u\lambda})^T + (H_{uu}\,\delta u)^T
= (H_{ux}\,\delta x)^T + \delta\lambda\,f_u + (H_{uu}\,\delta u)^T = 0. \tag{D.23}$$

Alternatively, we may consider an expansion of the objective function
and the state equation to second order since the first-order terms vanish
about a trajectory which satisfies (D.15)-(D.18). From Bryson and Ho
(1969), this may be accomplished by expanding (D.20) to second order
and all the constraints to first order. Thus, we have

$$\delta^2 J = \frac{1}{2}(\delta x^T\Phi_{xx}\,\delta x)_{t=T} + \frac{1}{2}\int_0^T (\delta x^T, \delta u^T)\begin{pmatrix} H_{xx} & H_{xu} \\ H_{ux} & H_{uu} \end{pmatrix}\begin{pmatrix} \delta x \\ \delta u \end{pmatrix}dt \tag{D.24}$$

subject to

$$\frac{d\,\delta x}{dt} = f_x\,\delta x + f_u\,\delta u, \quad \delta x(0) \text{ specified}. \tag{D.25}$$

Since we are interested in a neighboring extremal path, we must determine
$\delta u(t)$ so as to maximize $\delta^2 J$ subject to (D.25). This problem is
a linear-quadratic problem discussed in the previous section. For this
problem, the optimal control $\delta u^*(t)$ is given by the formula (D.13),
provided $H_{uu}(t)$ is nonsingular for $0 \le t \le T$. The case when $H_{uu}(t)$ is
singular for a finite time interval is treated in Section D.3. Thus,
recognizing that $G = \Phi_{xx}$, $C = H_{xx}$, $N = H_{xu}$, $D = H_{uu}$, $A = f_x$, and
$B = f_u$, we have

$$\delta u^*(t) = -H_{uu}^{-1}[H_{ux} + f_u^TS(t)]\,\delta x(t), \tag{D.26}$$

where

$$\dot{S} + Sf_x + f_x^TS - (Sf_u + H_{xu})H_{uu}^{-1}(f_u^TS + H_{ux}) + H_{xx} = 0, \quad S(T) = \Phi_{xx}. \tag{D.27}$$

While a number of second-order conditions can be obtained by proceeding
further in this manner, we shall be interested only in the
concavity condition (or strengthened Legendre-Clebsch condition). It is
possible to show that neighboring stationary paths exist (in a weak sense;
i.e., $\delta x$ and $\delta u$ are small) if

$$H_{uu}(t) < 0 \quad \text{for } 0 \le t \le T, \tag{D.28}$$

or in other words, $H_{uu}(t)$ is negative definite. First-order conditions,
condition (D.28), and the condition that S(t) is finite for $0 \le t \le T$
represent sufficient conditions for a trajectory to be a local maximum.
We are not being specific here because in this book we rely
mostly on the sufficiency conditions developed in Chapters 2-4, which are
based on certain concavity requirements. We are stating (D.28) because
of its similarity to the second-order condition for a local maximum in
the classical maximization problem.

We must note that

$$H_u = 0 \quad \text{and} \quad H_{uu} \le 0 \tag{D.29}$$

form necessary conditions for a trajectory to be a local maximum.



D.3 Singular Control 



In some optimization problems including some problems treated in this
text, extremal arcs satisfying $H_u = 0$ occur on which the matrix $H_{uu}$ is
singular. Such arcs are called singular arcs. Note that these arcs satisfy
(D.29) but not the strengthened condition (D.28). While no general
sufficiency conditions are available for singular arcs, some additional
necessary conditions known as the generalized Legendre-Clebsch conditions
have been developed. A good reference on singular control is Bell and
Jacobson (1975).

We shall only discuss the case in which the Hamiltonian is linear in
one or more of the control variables. For these systems, $H_u = 0$ implies
that the coefficient of the linear control term in the Hamiltonian vanishes
identically along a singular arc. Thus, the control is not determined in
terms of x and $\lambda$ by the Hamiltonian maximizing condition $H_u = 0$.
Instead, the control is determined by the requirement that the coefficient
of these linear terms remain zero on the singular arc. That is, the time
derivatives of $H_u$ must be zero. Having obtained the control by setting
$dH_u/dt = 0$ (or by setting higher time derivatives equal to zero) along the
singular arc, we must check additional necessary conditions analogous to
the second-order condition (D.28). For a maximization problem with a
single control variable, these conditions turn out to be



$$(-1)^k\frac{\partial}{\partial u}\left[\frac{d^{2k}H_u}{dt^{2k}}\right] \le 0, \quad k = 0,1,2,\ldots \tag{D.30}$$

The conditions (D.30) are called the generalized Legendre-Clebsch
conditions.

Example D.1 We present an example treated by Johnson and Gibson
(1963):

$$\max\left\{J = -\frac{1}{2}\int_0^T x_1^2\,dt\right\} \tag{D.31}$$

subject to

$$\dot{x}_1 = x_2 + u, \quad x_1(0) = a, \tag{D.32}$$

$$\dot{x}_2 = -u, \quad x_2(0) = b, \tag{D.33}$$

$$x_1(T) = x_2(T) = 0. \tag{D.34}$$

Solution. We form the Hamiltonian

$$H = -\frac{1}{2}x_1^2 + \lambda_1(x_2 + u) + \lambda_2(-u), \tag{D.35}$$

where the adjoint equations are

$$\dot{\lambda}_1 = x_1, \quad \dot{\lambda}_2 = -\lambda_1. \tag{D.36}$$


The optimal control is bang-bang plus singular. Singular arcs must satisfy

$$H_u = \lambda_1 - \lambda_2 = 0 \tag{D.37}$$

for a finite time interval. The optimal control can, therefore, be obtained
by

$$\frac{dH_u}{dt} = \dot{\lambda}_1 - \dot{\lambda}_2 = x_1 + \lambda_1 = 0. \tag{D.38}$$

Differentiating once more with respect to time t, we obtain

$$\frac{d^2H_u}{dt^2} = \dot{x}_1 + \dot{\lambda}_1 = x_2 + u + x_1 = 0,$$

which implies

$$u = -(x_1 + x_2) \tag{D.39}$$

along the singular arc. We now verify for the example the generalized
Legendre-Clebsch condition (D.30) for k = 1:

$$(-1)^1\frac{\partial}{\partial u}\left[\frac{d^2H_u}{dt^2}\right] = -1 \le 0. \tag{D.40}$$
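
As a short consistency check, one may substitute the singular control (D.39) back into the state and adjoint equations. The lines below are a sketch added for illustration; the symbol $t_s$ for the time at which the singular arc is entered is a label introduced here and does not appear in the example.

```latex
% Substituting u = -(x_1 + x_2) from (D.39) into (D.32)-(D.33):
\dot{x}_1 = x_2 + u = -x_1, \qquad \dot{x}_2 = -u = x_1 + x_2 .
% Hence, on a singular arc entered at time t_s,
x_1(t) = x_1(t_s)\, e^{-(t - t_s)} ,
% while (D.37) and (D.38) give
\lambda_1 = \lambda_2 = -x_1 .
% These agree with the adjoint equations (D.36):
\dot{\lambda}_1 = -\dot{x}_1 = x_1, \qquad \dot{\lambda}_2 = -\dot{x}_1 = x_1 = -\lambda_1 .
```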




Appendix E 

Answers to Selected 
Exercises 



Completely worked solutions to all exercises in this book are contained
in a forthcoming Teachers' Manual, which will be made available to
instructors by the publisher when it is ready.

Chapter 1 

1.1 (a) Feasible. J = —333,333. 

1.2 J = 36. 

1.3 (a) C = $157,861/year.

(b) J = 103.41 utils.

(c) $15,000/year.

1.4 (b) W(20) = 985,648; J = 104.34.

1.12 imp(Ci, (? 2 ; t) - (Cl - ^ 2 ) 6 "^^ 

Chapter 2 



2.2 The optimal control is

$$u^*(t) = \begin{cases} 2 & \text{if } 0 \le t < 2 - \ln 2.5, \\ \text{undefined} & \text{if } t = 2 - \ln 2.5, \\ 0 & \text{if } t > 2 - \ln 2.5. \end{cases}$$
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2.8 u* ~ bang(0, 1; Ai — A 2 ), where \{t) ~ (8e 4e 

2.10 (a) u* = bang[0, 1; (giKi + ^^ 2 ^ 2 )(Ai - A 2 )]. 

(c) i=T- (1/^2) ln[(i^2^i - gih)/{92 ~ 9 i)hl 
2.12 (a) x(lOO) = 30 - 20e~^^ ^ 30. 

(b) u* = S fort G [0,100]. 

1 3 for t G [0, 100-101n2], 

0 otherwise. 

2.14 (a) C*{t) = pWoe^^-P^y(l - e~P'^). 

(b) C*{t) = K(r-p). 

2.17 (a) X = X + SXx^, A(l) = 0, and x = —x^ + A, x(0) = 1. 

2.18 X = f{x) + b{x)u, a?(0) = xq, x{T) = 0. 

u = [b(xfg'(x) - 2c^u{b{x)f{x) - b'{x)f{x)}]/[2c^b{x)]. 

Chapter 3 

3.1 X — Ui > 0, Ui — U2 > 0^ Ui > 0, 1 U2 > 0. 

3.2 a: =[-1,5]. 

3.7 L — F{x, u) + Xf{xj u, t) + pg{x, u, t), 

A = -{a/a)X fi>0, /ig = 0. 

3.11 A(t) = i-1. 

3.12 (a) X(t) = 10 [1 - , 

0 if K = 300, 

-10 1 - if .S' < 300, 

k L 

u*{t) = bang[0, 3; A + 



The problem is infeasible for K > 300. 
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(b) r* = min[0, lOO-ET/3], 
0 fort<t**, 

3 for i > t*\ 



u*(t) = 



3.18 11.87 minutes. 

3.19 u* = -1, T* = 5.

3.20 u* = -2, T* = 5/2. 

3.29 (a) {/, P, A} = {h -p(S-Pi), S, 2(5 - Pi)}, 

(b) I = h. 



Chapter 4 



4.1 $u^*(t) = -1$, $\mu_1 = -\lambda = 1/2 - t$, $\mu_2 = \eta = 0$.

4.2 One solution appears in Figure 3.1. Another solution is $u(t) = 1/2$
for $t \in [0, 2]$. There are many others.

4.4 (a) u* = 0. 



(c) $$u^* = \begin{cases} 1, & 0 < t < 1 - T, \\ 0, & 1 - T < t < T. \end{cases}$$

(e) $J = -(1/8 + 1/8k)$.

(f) $J = -1/8$.

Chapter 5 



5.1 (a) $$u^*(t) = \begin{cases} 5, & t < 1 + 6 \ln 0.99 \approx 0.94, \\ 0, & t > 0.94. \end{cases}$$



(b) \2{t)/\i{t) = = \ 



-5, 0 < t < 0.28, 

0, 0.28 < t < 0.4, 

5, 0.4 < t < 0.93, 

0, 0.93 < t < 1.0.

5.5 $u^* = v^* = 0$ for all $t$.

5.7 $u^* = 0$, $v^* = 4/5$ for $t \in [0, 49]$,

$u^* = 0$, $v^* = 0$ for $t \in [49, 60]$,

$J^* = 34{,}420$.



5.10 (b) /(r) = f - 101n(l - 0.36“ “*). 

(c) t* = 1.969327, J(r) = 19.037. 

Chapter 6 

6.5 Q{t) = t*~ 160*3 1740^2 _ 7300^ + 9339. 

6.7 V* = sat[-V2, Vi; (Aa - \iv)WM\- 
6.9 J* = 10.56653. 



6.10 v*(i) sa 3e-®‘, y*{t) fa 1 - Ze~^K 



6.12 $$u^*(t) = \begin{cases} 0, & 0 < t < 7/3, \\ 2, & 7/3 < t < 3, \\ -1, & 3 < t < 13/3, \\ 0, & 13/3 < t < 6. \end{cases}$$



I ^ + 1 1 ^ ^ [0> l]i 

[ 0, *G(0,3]. 



M2 — 



0 , *€[ 0 , 1 . 8 ), 

~^*+§, *€[1.8,3]. 



I 0, *g[0,1)U(1.8,3], 

[ -|* + |, *€[ 1 , 1 . 8 ). 

6.14 

6.15 

6.17 

7.1 

7.7 

7.10 

7.12 

7.17 



(a) $$v^*(t) = \begin{cases} -1 & \text{for } t \in [0, 1.8), \\ 1 & \text{for } t \in (1.8, 3]. \end{cases}$$

(b) $v^*(t) = 1$ for $t \in [0, 10]$.

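$$v^* = \begin{cases} -1 & \text{for } t \in [0, 1/2], \\ 0 & \text{for } t \in (1/2, t_1], \\ +1 & \text{for } t \in (t_1, t_1 + 1/2], \\ 0 & \text{for } t \in (t_1 + 1/2, 4], \end{cases} \quad \text{where } t_1 = 23/12.$$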




0 , 

h{t~ti)/c, 



for 0 < i , 
for t\ < t <T, 



where ti = T — y/2BCjh. 

Chapter 7 



p* = 102.5 + 0.2G. 

$u/(pS) = \delta\beta/[\eta(\rho + \delta)]$.
G 4“ 6G = bang[0, 005 A 4~ 1] , 

-A4-(p4-5)A-7r'(G). 



The equations corresponding to (6.28) and (6.29) can be obtained 
by replacing p by p^fjr. The form of (6.30) remains unchanged. 

(b) 






1 xo 
rQ + 6 



h 



1 In ^ ~ 
rQ 4- ^ X —xt' 



T > ^ In - ^0) - ^^0 I i l„ 

~ rQ + 6 rQ{l — x^) — 6x^ 8 xt * 



7.18 

7.19 The reachable set is [xqc (^tq — x)e -f a;], 

where x = rQ/{W + rQ). 

7.25 imp(AB;«) = -iln[|5|'. 

7.26 (b) J = 0.6325. 

Chapter 8 

8.1 (a) y = 1, z = 3.

(b) y = 2, z = 10.

8.2 (a) (1,3) is a relative maximum, 

(b) (2,10) is a relative maximum. 

8.3 X = ht)\ X = 80. 

8.6 (a) x = 4 is a local maximum.

(b) x = 8 is a local maximum and x = 20 is a local and a global
maximum.

8.7 (a) (0, 0) is the nearest point. 

(b) (1/2, 1/2) is the nearest point. 

8.8 $(1/\sqrt{5}, 2/\sqrt{5})$ is the closest point.

8.9 (a) $(2\sqrt{2}, 0)$.

(b) (0,2). 

(c) (0,2). 

8.10 $\lambda_i^T = \partial F/\partial x_i^T$ for $i = 1, 2, \ldots, n$; $\lambda_{n+1}^T = 1$. Note that here $T$
denotes the terminal time, and not the transpose operation.

f 

+1 if > 1, 

8.14 = < -1 if < -1 . where A'= = (/ + 

0 if |A*+^6| < 1. 

Chapter 9 

9.1 = 5.25, T = 11. 

9.3 T = = 2.47. 

9.4 t^ = 0, T = 30. 

9.5 u*{t) — sat[0, where u^(t) = 2 — e0-05(t-34.8) + t), 

ti^3]t2~T = 34.8. 

Chapter 10 

10.3 X = 0.734. 

10.4 (a) 




(b) For $\rho = 0$, $x = 220{,}000$. For $\rho = 0.1$, $x = 86{,}000$. For $\rho = \infty$,
$x = 40{,}000$.

10.5 $[g'(x) - \rho][p - c(x)] - c'(x)g(x) = 0$.

10.7 $[g'(x) - \rho][p - c(x)] - c'(x)g(x) = 0$.

Chapter 11 

11.1 \{t) = where 

_ [KoeP^ + c(l - e^^)//3 - Kt\ {2p - (3) 

^ ePT _ g2{/3-p)T 

K{t) = Koe!^^ + ^(1 - _ e^‘). 

P fJ zp 

Chapter 12 

12.5 0 < r + ;iln(l - §) = i. 

12.8 ti = T/2. 

Chapter 13 

13.5 q*{x) = c*(x) (^p- rp - ^^x, 

X>0, 